RE: [PATCH] rtlanal, i386: Adjust pattern_cost and x86 constant cost [PR115910]

2025-04-03 Thread Roger Sayle
I agree returning to the GCC 14 behaviour is the best approach given the current stage. > -----Original Message----- > From: Jakub Jelinek > Sent: 03 April 2025 09:16 > To: Roger Sayle > Cc: 'Richard Biener'; 'Jan Hubicka'; 'Uros > Bizjak';

RE: [PATCH] rtlanal, i386: Adjust pattern_cost and x86 constant cost [PR115910]

2025-04-03 Thread Roger Sayle
issue (where the current trunk implementation is typically more correct than GCC 14's). Thoughts? > -----Original Message----- > From: Jakub Jelinek > Sent: 02 April 2025 12:30 > To: Richard Biener; Jan Hubicka; Uros Bizjak > ; Roger Sayle; Richard > Sandiford >

[gcc r15-7473] Synchronize include/dwarf2.def with binutils

2025-02-11 Thread Roger Sayle via Gcc-cvs
https://gcc.gnu.org/g:0f8fd6b336161ed0582edb08dbe6ea1932290a75 commit r15-7473-g0f8fd6b336161ed0582edb08dbe6ea1932290a75 Author: Roger Sayle Date: Tue Feb 11 12:21:43 2025 +0000 Synchronize include/dwarf2.def with binutils The contents of include/dwarf2.def have diverged between

[PATCH] Synchronize include/dwarf2.def with binutils

2025-02-10 Thread Roger Sayle
file by copying the definition of DW_CFA_AARCH64_negate_ra_state_with_pc from binutils, restoring the ability to build a combined source tree. Tested on x86_64-pc-linux-gnu with "make bootstrap". Ok for mainline? 2025-02-10 Roger Sayle include/Chang

[gcc r15-3342] i386: Support read-modify-write memory operands in STV.

2024-08-31 Thread Roger Sayle via Gcc-cvs
https://gcc.gnu.org/g:bac00c34226bac3a95979b21dc2d668a96b14f6e commit r15-3342-gbac00c34226bac3a95979b21dc2d668a96b14f6e Author: Roger Sayle Date: Sat Aug 31 14:17:18 2024 -0600 i386: Support read-modify-write memory operands in STV. This patch enables STV when the first operand

[x86_64 PATCH] Support read-modify-write memory operands in STV.

2024-08-31 Thread Roger Sayle
xmm0 vmovdqa %xmm0, m(%rip) ret This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-08-31 Roger Sayle gcc/ChangeLog * config/i386/i386-feature

[gcc r15-3281] i386: Support wide immediate constants in STV.

2024-08-28 Thread Roger Sayle via Gcc-cvs
https://gcc.gnu.org/g:3cb92be94e6581697369eeafdb67057c8cfba73f commit r15-3281-g3cb92be94e6581697369eeafdb67057c8cfba73f Author: Roger Sayle Date: Wed Aug 28 21:19:28 2024 -0600 i386: Support wide immediate constants in STV. This patch provides more accurate costs/gains for (wide

[gcc r15-3162] i386: Update STV's gains for TImode arithmetic right shifts on AVX2.

2024-08-25 Thread Roger Sayle via Gcc-cvs
https://gcc.gnu.org/g:07d62a1711f3e3bbdd2146ab5914d3bc5e246509 commit r15-3162-g07d62a1711f3e3bbdd2146ab5914d3bc5e246509 Author: Roger Sayle Date: Sun Aug 25 09:14:34 2024 -0600 i386: Update STV's gains for TImode arithmetic right shifts on AVX2. This patch t

[x86_64 PATCH] Update STV's gains for TImode arithmetic right shifts on AVX2.

2024-08-24 Thread Roger Sayle
ithout --target_board=unix{-m32} with no new failures. No new testcase (yet) as the code for both the vector and scalar forms of the above function are still suboptimal so code generation is in flux, but this improvement should be a step in the right direction. Ok for mainline? 2024-08-24 Roger Sayle

[gcc r15-2940] i386: Improve split of *extendv2di2_highpart_stv_noavx512vl.

2024-08-15 Thread Roger Sayle via Gcc-cvs
https://gcc.gnu.org/g:b6fb4f7f651d2aa89548c5833fe2679af2638df5 commit r15-2940-gb6fb4f7f651d2aa89548c5833fe2679af2638df5 Author: Roger Sayle Date: Thu Aug 15 22:02:05 2024 +0100 i386: Improve split of *extendv2di2_highpart_stv_noavx512vl. This patch follows up on the previous

[x86_64 PATCH] Support wide immediate constants in STV.

2024-08-15 Thread Roger Sayle
instruction. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-08-15 Roger Sayle gcc/ChangeLog * config/i386/i386-features.cc (timode_immed_const_gain): New

[x86 PATCH] Improve split of *extendv2di2_highpart_stv_noavx512vl.

2024-08-15 Thread Roger Sayle
which applies when not performing the above optimization, i.e. on TARGET_XOP. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-08-15 Roger Sayle Uros B

RE: [PATCH] Re-add calling emit_clobber in lower-subreg.cc's resolve_simple_move.

2024-08-13 Thread Roger Sayle
Hi Xianmiao, I have no objection to reverting that original patch, if it was indeed made obsolete by later changes to the i386 backend. The theory at the time was that it was possible for backends to define mov instructions that emitted clobbers if necessary, but it's very difficult for a backen

[gcc r15-2880] PR target/116275: Handle STV of *extenddi2_doubleword_highpart on i386.

2024-08-11 Thread Roger Sayle via Gcc-cvs
https://gcc.gnu.org/g:7a970bd03f1d8eed7703db8a8db3c753ea68899f commit r15-2880-g7a970bd03f1d8eed7703db8a8db3c753ea68899f Author: Roger Sayle Date: Mon Aug 12 06:52:48 2024 +0100 PR target/116275: Handle STV of *extenddi2_doubleword_highpart on i386. This patch resolves PR target

[x86 PATCH] PR target/116275: Handle STV of *extenddi2_doubleword_highpart

2024-08-11 Thread Roger Sayle
h and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-08-11 Roger Sayle gcc/ChangeLog PR target/116275 * config/i386/i386.md (*extendv2di2_highpart_stv_noavx512vl): New define_insn_and_split to handle the STV conversion of the DImode pa

[gcc r15-2816] i386: Tweak ix86_mode_can_transfer_bits to restore bootstrap on RHEL.

2024-08-08 Thread Roger Sayle via Gcc-cvs
https://gcc.gnu.org/g:4d44f3fc387815eb232d7757352857993a1d21d9 commit r15-2816-g4d44f3fc387815eb232d7757352857993a1d21d9 Author: Roger Sayle Date: Thu Aug 8 11:16:29 2024 +0100 i386: Tweak ix86_mode_can_transfer_bits to restore bootstrap on RHEL. This minor patch, very similar to

[x86 PATCH] Tweak ix86_mode_can_transfer_bits to restore bootstrap on RHEL.

2024-08-08 Thread Roger Sayle
DFmode being "non-literal types in constant expressions". This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, with no new failures. Ok for mainline? 2024-08-08 Roger Sayle gcc/ChangeLog * config/i386/i386.cc (ix86_mode_can_transfer_bit

RE: [x86_64 PATCH] Refactor V2DI arithmetic right shift expansion for STV.

2024-08-07 Thread Roger Sayle
e has been committed as obvious. Sorry again for the inconvenience. Tested on x86_64-pc-linux-gnu with RUNTESTFLAGS="dg.exp=sse2-pr85572-1.C". 2024-08-07 Roger Sayle gcc/testsuite/ChangeLog * g++.dg/other/sse2-pr85572-1.C: Update expected output after my recent patc

[gcc r15-2793] testsuite: Fix recent regression of g++.dg/other/sse2-pr85572-1.C

2024-08-07 Thread Roger Sayle via Gcc-cvs
https://gcc.gnu.org/g:990a65fb1aa5d1b05a7737df879afb6900e2ce96 commit r15-2793-g990a65fb1aa5d1b05a7737df879afb6900e2ce96 Author: Roger Sayle Date: Wed Aug 7 12:52:26 2024 +0100 testsuite: Fix recent regression of g++.dg/other/sse2-pr85572-1.C My sincere apologies for not noticing

[gcc r15-2758] i386: Refactor V2DI arithmetic right shift expansion for STV.

2024-08-06 Thread Roger Sayle via Gcc-cvs
https://gcc.gnu.org/g:2f759fa9f4dd78ae8d86482ccda72a335aaac404 commit r15-2758-g2f759fa9f4dd78ae8d86482ccda72a335aaac404 Author: Roger Sayle Date: Tue Aug 6 17:19:29 2024 +0100 i386: Refactor V2DI arithmetic right shift expansion for STV. This patch refactors ashrv2di RTL

[x86_64 PATCH] Support memory destinations and wide immediate constants in STV.

2024-08-05 Thread Roger Sayle
-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-08-05 Roger Sayle gcc/ChangeLog * config/i386/i386-features.cc (timode_immed_const_gain): New function to determine the gain/cost on a CONST_

[x86_64 PATCH] Refactor V2DI arithmetic right shift expansion for STV.

2024-08-05 Thread Roger Sayle
ficial). This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-08-05 Roger Sayle gcc/ChangeLog * config/i386/i386-expand.cc (ix86_expand_v2di_ashiftrt): New

[PATCH] PR tree-optimization/57371: Optimize (float)i == 16777222.0f sometimes.

2024-07-28 Thread Roger Sayle
make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? If the testcases need to be tweaked for non-IEEE targets (the transformations themselves should be portable to VAX and IBM floating point formats) hopefully that can be done as follow-up patches

[nvptx PATCH] Implement isfinite and isnormal optabs in nvptx.md.

2024-07-27 Thread Roger Sayle
/pipermail/gcc-patches/2024-July/657881.html [which I'm sad to see is taking a while to review/get approved]. Ok for mainline? 2024-07-27 Roger Sayle gcc/ChangeLog * config/nvptx/nvptx.md (UNSPEC_COPYSIGN): No longer required. (UNSPEC_ISFINITE): New UNSPEC. (

[gcc r15-2359] Fold ctz(-x) and ctz(abs(x)) as ctz(x) in match.pd.

2024-07-27 Thread Roger Sayle via Gcc-cvs
https://gcc.gnu.org/g:928116e94a5a8a995dffd926af58abfa7286e78e commit r15-2359-g928116e94a5a8a995dffd926af58abfa7286e78e Author: Roger Sayle Date: Sat Jul 27 15:16:19 2024 +0100 Fold ctz(-x) and ctz(abs(x)) as ctz(x) in match.pd. The subject line pretty much says it all; the
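
As a hedged illustration of why this fold is safe (this is not the match.pd pattern itself): negating a two's-complement value leaves its lowest set bit in place, so the trailing-zero count is unchanged. The sketch below uses GCC's `__builtin_ctz`; the helper names `ctz_plain` and `ctz_of_negated` are hypothetical, and the negation is done in unsigned arithmetic to avoid signed-overflow undefined behaviour.

```c
/* Count trailing zeros of a nonzero value.  The result is undefined
   for 0, as with __builtin_ctz itself.  Hypothetical helper name.  */
static inline int ctz_plain(unsigned int x)
{
    return __builtin_ctz(x);
}

/* Count trailing zeros of the two's-complement negation of x.
   Unsigned negation wraps modulo 2^N, so this is well defined for
   every nonzero x, including the value with only the top bit set.  */
static inline int ctz_of_negated(unsigned int x)
{
    return __builtin_ctz(0u - x);
}
```

For any nonzero `x`, both helpers return the same count, which is the property the simplification relies on.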

[match.pd PATCH] Fold ctz(-x) as ctz(x).

2024-07-23 Thread Roger Sayle
with no new failures. Ok for mainline? 2024-07-23 Roger Sayle gcc/ChangeLog * match.pd (ctz (-X) => ctz (X)): New simplification. gcc/testsuite/ChangeLog * gcc.dg/fold-ctz-1.c: New test case. Thanks in advance, Roger -- diff --git a/gcc/match.pd b/gcc/match.pd index 6818856..

[testsuite PATCH] Robustify lib/g++.exp

2024-07-22 Thread Roger Sayle
's no harm in (also) confirming that it exists in g++_include_flags. This patch has been tested on x86_64-pc-linux-gnu (where it allows a cross-compiler to arc-linux to produce g++ compilation results). Ok for mainline? 2024-07-22 Roger Sayle gcc/testsuite/ChangeLog * lib/g++.

[ARC PATCH] Improve performance of SImode right shifts (take #2)

2024-07-22 Thread Roger Sayle
opsys, is anyone able to test these changes? Thanks in advance. 2024-07-22 Roger Sayle gcc/ChangeLog * config/arc/arc-protos.h (output_rlc_loop): Prototype here. (arc_split_rlc): Prototype here. * config/arc/arc.cc (output_rlc_loop): Output a zero-overhead loop o

[gcc r15-2132] Implement a -ftrapping-math/-fsignaling-nans TODO in match.pd.

2024-07-18 Thread Roger Sayle via Gcc-cvs
https://gcc.gnu.org/g:030186cabe8128e752619e101768cf8823a42c38 commit r15-2132-g030186cabe8128e752619e101768cf8823a42c38 Author: Roger Sayle Date: Thu Jul 18 08:27:36 2024 +0100 Implement a -ftrapping-math/-fsignaling-nans TODO in match.pd. I've been investigating some (fl

[PATCH] Implement a -ftrapping-math/-fsignaling-nans TODO in match.pd.

2024-07-17 Thread Roger Sayle
e? 2024-07-17 Roger Sayle gcc/ChangeLog * match.pd ((FTYPE) N CMP CST): Only worry about exceptions with flag_trapping_math, and about signaling NaNs with HONOR_SNANS. gcc/testsuite/ChangeLog * c-c++-common/pr57371-4.c: Update comment. * c-c++-common/pr57371-5

RE: [PATCH] Use foreach, not lmap, for tcl <= 8.5 compat

2024-07-16 Thread Roger Sayle
Hi Jørgen, Awesome. Very many thanks for the speedy fix. Roger -- > -----Original Message----- > From: Jørgen Kvalsvik > Sent: 14 July 2024 20:46 > To: gcc-patches@gcc.gnu.org > Cc: jeffreya...@gmail.com; ro...@nextmovesoftware.com; Jørgen Kvalsvik > > Subject: [PATCH] Use foreach, not lmap,

[gcc r15-2053] PR tree-optimization/114661: Generalize MULT_EXPR recognition in match.pd.

2024-07-16 Thread Roger Sayle via Gcc-cvs
https://gcc.gnu.org/g:df9451936c6c9e4faea371e3f188e1fc6b6d39e3 commit r15-2053-gdf9451936c6c9e4faea371e3f188e1fc6b6d39e3 Author: Roger Sayle Date: Tue Jul 16 07:58:28 2024 +0100 PR tree-optimization/114661: Generalize MULT_EXPR recognition in match.pd. This patch resolves PR tree

[gcc r15-2027] i386: Tweak i386-expand.cc to restore bootstrap on RHEL.

2024-07-14 Thread Roger Sayle via Gcc-cvs
https://gcc.gnu.org/g:74e6dfb23163c2dd670d1d60fbf4c782e0b44b94 commit r15-2027-g74e6dfb23163c2dd670d1d60fbf4c782e0b44b94 Author: Roger Sayle Date: Sun Jul 14 17:22:27 2024 +0100 i386: Tweak i386-expand.cc to restore bootstrap on RHEL. This is a minor change to restore bootstrap

Re: [pushed] Add function filtering to gcov

2024-07-14 Thread Roger Sayle
I’m seeing (dejagnu) testsuite problems from this (recent) patch. Running /home/roger/GCC/patchem/gcc/testsuite/gcc.misc-tests/gcov.exp ... ERROR: (DejaGnu) proc "lmap key { snd } { if { $key in $seen } continue set key }" does not exist. The error code is NONE The info on th

[x86 PATCH] Tweak i386-expand.cc to restore bootstrap on RHEL.

2024-07-14 Thread Roger Sayle
ke bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures (from this change). Ok for mainline? 2024-07-14 Roger Sayle * config/i386/i386-expand.cc (ix86_expand_fp_absneg_operator): Use E_?Fmode enumeration constants in switch statement.

[match.pd PATCH] PR tree-optimization/114661: Generalize MULT_EXPR recognition (take #2)

2024-07-14 Thread Roger Sayle
tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-07-14 Roger Sayle Richard Biener gcc/ChangeLog PR tree-optimization/114661 * match.pd ((X*C1)|(X*C2) to

[gcc r15-2000] i386: Some AVX512 ternlog expansion refinements.

2024-07-12 Thread Roger Sayle via Gcc-cvs
https://gcc.gnu.org/g:6b5d263f2c90c3e22cdf576970c94bca268c5296 commit r15-2000-g6b5d263f2c90c3e22cdf576970c94bca268c5296 Author: Roger Sayle Date: Fri Jul 12 12:30:56 2024 +0100 i386: Some AVX512 ternlog expansion refinements. This patch replaces the calls to force_reg in

[x86 SSE PATCH] Some AVX512 ternlog expansion refinements (take #2)

2024-07-11 Thread Roger Sayle
line? 2024-07-11 Roger Sayle Hongtao Liu gcc/ChangeLog * config/i386/i386-expand.cc (ix86_broadcast_from_constant): Use CONST_VECTOR_P instead of comparison against GET_CODE. (ix86_gen_bcst_mem): Likewise. (ix86_ternlog_le

[ARC PATCH] Improve performance of SImode right shifts.

2024-07-11 Thread Roger Sayle
ns@16 cycles This patch has been minimally tested by building a cross-compiler to arc-linux hosted on x86_64-pc-linux-gnu where there are no new failures from "make -k check" in the compile-only tests. Ok for mainline (after 3rd-party testing)? 2024-07-11 Roger Sayle gcc/ChangeLog

[nvptx PATCH] Implement rtx_costs target hook for nvptx backend.

2024-07-11 Thread Roger Sayle
s 4.123190 seconds So about a 3.7x performance improvement. This patch has been tested with make and make -k check for nvptx-none hosted on x86_64-pc-linux-gnu with no new failures. Ok for mainline? 2024-07-11 Roger Sayle gcc/ChangeLog * config/nvptx/nvptx.cc (nvptx_rtx_size_costs): New f

[match.pd PATCH] PR tree-optimization/114661: Generalize MULT_EXPR recognition.

2024-07-09 Thread Roger Sayle
t This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-07-09 Roger Sayle gcc/ChangeLog PR tree-optimization/114661 * match.pd ((X*C1)|(X*C2

[x86 SSE PATCH] Some AVX512 ternlog expansion refinements.

2024-07-07 Thread Roger Sayle
} with no new failures. Ok for mainline? 2024-07-07 Roger Sayle gcc/ChangeLog * config/i386/i386-expand.cc (ix86_broadcast_from_constant): Use CONST_VECTOR_P instead of comparison against GET_CODE. (ix86_gen_bcst_mem): Likewise. (ix86_ternlog_leaf_p): Likewise

[gcc r15-1869] PR target/115751: Avoid force_reg in ix86_expand_ternlog.

2024-07-05 Thread Roger Sayle via Gcc-cvs
https://gcc.gnu.org/g:9a7e3f57e1ab8e6e4cf5ea3c0998aa50c6220579 commit r15-1869-g9a7e3f57e1ab8e6e4cf5ea3c0998aa50c6220579 Author: Roger Sayle Date: Sat Jul 6 05:24:39 2024 +0100 PR target/115751: Avoid force_reg in ix86_expand_ternlog. This patch fixes a problem with splitting of

[x86 SSE PATCH] PR target/115751: Avoid force_reg in ix86_expand_ternlog.

2024-07-04 Thread Roger Sayle
This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-07-04 Roger Sayle gcc/ChangeLog PR target/115751 * config/i386/i386-expand.c (ix86_expand_t

[gcc r15-1835] i386: Add additional variant of bswaphisi2_lowpart peephole2.

2024-07-03 Thread Roger Sayle via Gcc-cvs
https://gcc.gnu.org/g:727f8b142b7d5442af6c2e903293abc367a8de5f commit r15-1835-g727f8b142b7d5442af6c2e903293abc367a8de5f Author: Roger Sayle Date: Thu Jul 4 07:31:17 2024 +0100 i386: Add additional variant of bswaphisi2_lowpart peephole2. This patch adds an additional variation

[x86 PATCH] Add additional variant of bswaphisi2_lowpart peephole2.

2024-07-01 Thread Roger Sayle
$8, %di jmp ext This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-07-01 Roger Sayle gcc/ChangeLog * config/i386/i386.md (bswaphisi2_lowpa

[gcc r15-1752] testsuite: Fix -m32 gcc.target/i386/pr102464-vrndscaleph.c on RedHat.

2024-07-01 Thread Roger Sayle via Gcc-cvs
https://gcc.gnu.org/g:589865a8e4f6bd26c622ea0ee0a38565a0d42e80 commit r15-1752-g589865a8e4f6bd26c622ea0ee0a38565a0d42e80 Author: Roger Sayle Date: Mon Jul 1 12:21:20 2024 +0100 testsuite: Fix -m32 gcc.target/i386/pr102464-vrndscaleph.c on RedHat. This patch fixes the 4 FAILs of

[gcc r15-1751] i386: Additional peephole2 to use lea in round-up integer division.

2024-07-01 Thread Roger Sayle via Gcc-cvs
https://gcc.gnu.org/g:142b5263b18be96e5d9ce406ad2c1b6ab35c190f commit r15-1751-g142b5263b18be96e5d9ce406ad2c1b6ab35c190f Author: Roger Sayle Date: Mon Jul 1 12:18:26 2024 +0100 i386: Additional peephole2 to use lea in round-up integer division. A common idiom for implementing an

[x86 SSE PATCH] Remove legacy ternlog patterns from sse.md

2024-06-30 Thread Roger Sayle
hange. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-06-30 Roger Sayle gcc/ChangeLog * config/i386/sse.md (*vmov_constm1_pternlog_false_dep):

RE: [x86 PATCH]: Additional peephole2 to use lea in round-up integer division.

2024-06-30 Thread Roger Sayle
Hi Uros, > On Sat, Jun 29, 2024 at 6:21 PM Roger Sayle > wrote: > > A common idiom for implementing an integer division that rounds > > upwards is to write (x + y - 1) / y. Conveniently on x86, the two > > additions to form the numerator can be performed by a single
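
For readers unfamiliar with the idiom quoted above, here is a minimal C sketch of round-up division. The function name `div_round_up` is hypothetical, and the code assumes `y > 0` and that `x + y - 1` does not wrap around; it shows only the source-level idiom, not the peephole2 pattern from the patch.

```c
/* Round-up (ceiling) division for unsigned operands, using the
   (x + y - 1) / y idiom.  Assumes y > 0 and that x + y - 1 does
   not wrap around.  Hypothetical helper name.  */
static inline unsigned int div_round_up(unsigned int x, unsigned int y)
{
    return (x + y - 1) / y;
}
```

A typical use is computing how many fixed-size blocks are needed to hold `x` items, for example `div_round_up(n_bytes, block_size)`.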

[testsuite PATCH] Fix -m32 gcc.target/i386/pr102464-vrndscaleph.c on RedHat.

2024-06-30 Thread Roger Sayle
ound is to define __NO_MATH_INLINES before #include (or alternatively use __builtin_floor, __builtin_ceil, etc.). This patch has been tested on x86_64-pc-linux-gnu with make -k check, with and without --target_board=unix{-m32}. Ok for mainline? 2024-06-30 Roger Sayle gcc/testsuite/ChangeLog

[x86 PATCH]: Additional peephole2 to use lea in round-up integer division.

2024-06-29 Thread Roger Sayle
inux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-06-29 Roger Sayle gcc/ChangeLog * config/i386/i386.md (peephole2): Transform two consecutive additions into a 3-component lea if !TARGET

[gcc r15-1702] i386: Handle sign_extend like zero_extend in *concatditi3_[346]

2024-06-27 Thread Roger Sayle via Gcc-cvs
https://gcc.gnu.org/g:07e915913b6b3d4e6e210f6dbc8e7e0e8ea594c4 commit r15-1702-g07e915913b6b3d4e6e210f6dbc8e7e0e8ea594c4 Author: Roger Sayle Date: Fri Jun 28 07:16:07 2024 +0100 i386: Handle sign_extend like zero_extend in *concatditi3_[346] This patch generalizes some of the

[gcc r15-1701] i386: Some additional AVX512 ternlog refinements.

2024-06-27 Thread Roger Sayle via Gcc-cvs
https://gcc.gnu.org/g:5938cf021e95b40b040974c9cbe7860399247f7f commit r15-1701-g5938cf021e95b40b040974c9cbe7860399247f7f Author: Roger Sayle Date: Fri Jun 28 07:12:53 2024 +0100 i386: Some additional AVX512 ternlog refinements. This patch is another round of refinements to fine

RE: nvptx vs. [PATCH] Add a late-combine pass [PR106594]

2024-06-27 Thread Roger Sayle
.@ventanamicro.com; rdapp@gmail.com; gcc-patches@gcc.gnu.org; > Tom de Vries ; Roger Sayle > Subject: Re: nvptx vs. [PATCH] Add a late-combine pass [PR106594] > > Hi! > > On 2024-06-27T22:27:21+0200, I wrote: > > On 2024-06-27T18:49:17+0200, I wrote: > >> On 2023-10-

[x86 PATCH] Handle sign_extend like zero_extend in *concatditi3_[346]

2024-06-27 Thread Roger Sayle
with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-06-27 Roger Sayle gcc/ChangeLog * config/i386/i386.md (*concat3_3): Change zero_extend to any_extend in first operand to left shift by mode precision. (*concat3_4): Likewise.

[x86 SSE PATCH] Some additional ternlog refinements.

2024-06-27 Thread Roger Sayle
ently use decimal. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-06-27 Roger Sayle gcc/ChangeLog * config/i386/i386-expand.cc (ix86_ternlo

[gcc r15-1584] PR tree-optimization/113673: Avoid load merging when potentially trapping.

2024-06-24 Thread Roger Sayle via Gcc-cvs
https://gcc.gnu.org/g:d8b05aef77443e1d3d8f3f5d2c56ac49a503fee3 commit r15-1584-gd8b05aef77443e1d3d8f3f5d2c56ac49a503fee3 Author: Roger Sayle Date: Mon Jun 24 15:34:03 2024 +0100 PR tree-optimization/113673: Avoid load merging when potentially trapping. This patch fixes PR tree

[ARC PATCH] Improved SImode conditional moves (improves DImode shifts).

2024-06-22 Thread Roger Sayle
sue is also described at https://github.com/foss-for-synopsys-dwc-arc-processors/gcc/issues/110 Tested with a cross-compiler to arc-linux hosted on x86_64, with no new (compile-only) regressions from make -k check. Ok for mainline if this passes Claudiu's and/or Jeff's testing? 20

[PATCH v2] PR tree-opt/113673: Avoid load merging when potentially trapping.

2024-06-21 Thread Roger Sayle
ke bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-06-21 Roger Sayle Richard Biener gcc/ChangeLog PR tree-optimization/113673 * gimple-ssa-store-merging.cc (find_bswap_or_nop_lo

[gcc r15-1502] i386: Allow all register_operand SUBREGs in x86_ternlog_idx.

2024-06-20 Thread Roger Sayle via Gcc-cvs
https://gcc.gnu.org/g:9a76db24e044c8058497051a652cca4228cbc8e9 commit r15-1502-g9a76db24e044c8058497051a652cca4228cbc8e9 Author: Roger Sayle Date: Thu Jun 20 16:30:15 2024 +0100 i386: Allow all register_operand SUBREGs in x86_ternlog_idx. This patch tweaks ix86_ternlog_idx to

[x86 PATCH] Allow all register_operand SUBREGs in x86_ternlog_idx.

2024-06-18 Thread Roger Sayle
ode V4SF. This patch allows the recently added ternlog_operand to accept this case. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-06-18 Roger Sayle gcc/C

[gcc r15-1306] i386: More use of m{32, 64}bcst addressing modes with ternlog.

2024-06-13 Thread Roger Sayle via Gcc-cvs
https://gcc.gnu.org/g:c129a34dc8e69f7b34cf72835aeba2cefbb8673a commit r15-1306-gc129a34dc8e69f7b34cf72835aeba2cefbb8673a Author: Roger Sayle Date: Fri Jun 14 06:29:27 2024 +0100 i386: More use of m{32,64}bcst addressing modes with ternlog. This patch makes more use of m32bcst and

[x86 PATCH] More use of m{32,64}bcst addressing modes with ternlog.

2024-06-12 Thread Roger Sayle
ret// 1 = 42 total This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-06-12 Roger Sayle gcc/ChangeLog * config/i386/i38

[gcc r15-1175] i386: PR target/115397: AVX512 ternlog vs. -m32 -fPIC constant pool.

2024-06-11 Thread Roger Sayle via Gcc-cvs
https://gcc.gnu.org/g:a797398cfbc75899fdb7d97436c0c89c02b133c0 commit r15-1175-ga797398cfbc75899fdb7d97436c0c89c02b133c0 Author: Roger Sayle Date: Tue Jun 11 09:31:34 2024 +0100 i386: PR target/115397: AVX512 ternlog vs. -m32 -fPIC constant pool. This patch fixes PR target/115397

[x86 PATCH] PR target/115397: AVX512 ternlog vs. -m32 -fPIC constant pool.

2024-06-10 Thread Roger Sayle
x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-06-10 Roger Sayle gcc/ChangeLog PR target/115397 * config/i386/i386-expand.cc (ix86_expand_te

[gcc r15-1111] analyzer: Restore g++ 4.8 bootstrap; use std::move to return std::unique_ptr.

2024-06-07 Thread Roger Sayle via Gcc-cvs
https://gcc.gnu.org/g:e22b7f741ab54ff3a3f8a676ce9e7414fe174958 commit r15-1111-ge22b7f741ab54ff3a3f8a676ce9e7414fe174958 Author: Roger Sayle Date: Sat Jun 8 05:01:38 2024 +0100 analyzer: Restore g++ 4.8 bootstrap; use std::move to return std::unique_ptr. This patch restores

[analyzer PATCH] Restore bootstrap with g++ 4.8.

2024-06-07 Thread Roger Sayle
using "scl enable devtoolset-10") as host compilers. Ok for mainline? 2024-06-07 Roger Sayle gcc/analyzer/ChangeLog * constraint-manager.cc (equiv_class::make_dump_widget): Use std::move to return a std::unique_ptr. (bounded_ranges_constraint::make_dump_wi

[gcc r15-1101] i386: PR target/115351: RTX costs for *concatditi3 and *insvti_highpart.

2024-06-07 Thread Roger Sayle via Gcc-cvs
https://gcc.gnu.org/g:fb3e4c549d16d5050e10114439ad77149f33c597 commit r15-1101-gfb3e4c549d16d5050e10114439ad77149f33c597 Author: Roger Sayle Date: Fri Jun 7 14:03:20 2024 +0100 i386: PR target/115351: RTX costs for *concatditi3 and *insvti_highpart. This patch addresses PR target

[gcc r15-1100] i386: Improve handling of ternlog instructions in i386/sse.md

2024-06-07 Thread Roger Sayle via Gcc-cvs
https://gcc.gnu.org/g:ec985bc97a01577bca8307f986caba7ba7633cde commit r15-1100-gec985bc97a01577bca8307f986caba7ba7633cde Author: Roger Sayle Date: Fri Jun 7 13:57:23 2024 +0100 i386: Improve handling of ternlog instructions in i386/sse.md This patch improves the way that the x86

[x86 PATCH] PR target/115351: RTX costs for *concatditi3 and *insvti_highpart.

2024-06-07 Thread Roger Sayle
e? 2024-06-07 Roger Sayle gcc/ChangeLog PR target/115351 * config/i386/i386.cc (ix86_rtx_costs): Provide estimates for the *concatditi3 and *insvti_highpart patterns, about two insns. gcc/testsuite/ChangeLog PR target/115351 * g++.target/i386/pr1153

[gcc r15-775] i386: Correct insn_cost of movabsq.

2024-05-22 Thread Roger Sayle via Gcc-cvs
https://gcc.gnu.org/g:a3b16e73a2d5b2d4d20ef6f2fd164cea633bbec8 commit r15-775-ga3b16e73a2d5b2d4d20ef6f2fd164cea633bbec8 Author: Roger Sayle Date: Wed May 22 16:45:48 2024 +0100 i386: Correct insn_cost of movabsq. This single line patch fixes a strange quirk/glitch in i386

[x86_64 PATCH] Correct insn_cost of movabsq.

2024-05-22 Thread Roger Sayle
e -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-05-22 Roger Sayle gcc/ChangeLog * config/i386/i386.cc (ix86_rtx_costs) : A CONST_INT that isn't x86_64_immediate_operand requires an extra (expensive) movabsq in

[gcc r15-774] Avoid ICE in except.cc on targets that don't support exceptions.

2024-05-22 Thread Roger Sayle via Gcc-cvs
https://gcc.gnu.org/g:26df7b4684e201e66c09dd018603a248ddc5f437 commit r15-774-g26df7b4684e201e66c09dd018603a248ddc5f437 Author: Roger Sayle Date: Wed May 22 13:48:52 2024 +0100 Avoid ICE in except.cc on targets that don't support exceptions. A number of testcases currently

[PATCH] Avoid ICE in except.cc on targets that don't support exceptions.

2024-05-22 Thread Roger Sayle
This patch has been tested on nvptx-none hosted on x86_64-pc-linux-gnu with no new failures in the testsuite, and ~220 fewer FAILs. Ok for mainline? 2024-05-22 Roger Sayle gcc/ChangeLog * except.cc (output_function_exception_table): Move call to get_personality

[gcc r15-648] nvptx: Correct pattern for popcountdi2 insn in nvptx.md.

2024-05-19 Thread Roger Sayle via Gcc-cvs
https://gcc.gnu.org/g:1676ef6e91b902f592270e4bcf10b4fc342e200d commit r15-648-g1676ef6e91b902f592270e4bcf10b4fc342e200d Author: Roger Sayle Date: Sun May 19 09:49:45 2024 +0100 nvptx: Correct pattern for popcountdi2 insn in nvptx.md. The result of a POPCOUNT operation in RTL

[x86 SSE] Improve handling of ternlog instructions in i386/sse.md (v2)

2024-05-17 Thread Roger Sayle
64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-05-17 Roger Sayle Hongtao Liu gcc/ChangeLog PR target/115021 * config/i386/i386-expand.cc (ix86_expand

[x86 SSE] Improve handling of ternlog instructions in i386/sse.md

2024-05-12 Thread Roger Sayle
inux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-05-12 Roger Sayle gcc/ChangeLog PR target/115021 * config/i386/i386-expand.cc (ix86_expand_args_builtin): Call fixup_modeless_co

[gcc r15-390] arm: Use utxb rN, rM, ror #8 to implement zero_extract on armv6.

2024-05-12 Thread Roger Sayle via Gcc-cvs
https://gcc.gnu.org/g:46077992180d6d86c86544df5e8cb943492d3b01 commit r15-390-g46077992180d6d86c86544df5e8cb943492d3b01 Author: Roger Sayle Date: Sun May 12 16:27:22 2024 +0100 arm: Use utxb rN, rM, ror #8 to implement zero_extract on armv6. Examining the code generated for the

[gcc r15-366] i386: Improve V[48]QI shifts on AVX512/SSE4.1

2024-05-10 Thread Roger Sayle via Gcc-cvs
https://gcc.gnu.org/g:f5a8cdc1ef5d6aa2de60849c23658ac5298df7bb commit r15-366-gf5a8cdc1ef5d6aa2de60849c23658ac5298df7bb Author: Roger Sayle Date: Fri May 10 20:26:40 2024 +0100 i386: Improve V[48]QI shifts on AVX512/SSE4.1 The following one line patch improves the code generated

Re: [x86 PATCH] Improve V[48]QI shifts on AVX512

2024-05-10 Thread Roger Sayle
his weekend. Thanks again, Roger > From: Hongtao Liu > On Fri, May 10, 2024 at 6:26 AM Roger Sayle > wrote: > > > > > > The following one line patch improves the code generated for V8QI and > > V4QI shifts when AVX512BW and AVX512VL functionality is available.

[x86 PATCH] Improve V[48]QI shifts on AVX512

2024-05-09 Thread Roger Sayle
ch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-05-09 Roger Sayle gcc/ChangeLog * config/i386/i386-expand.cc (ix86_expand_vecop_qihi_partial): Don

[gcc r15-352] Constant fold {-1,-1} << 1 in simplify-rtx.cc

2024-05-09 Thread Roger Sayle via Gcc-cvs
https://gcc.gnu.org/g:f2449b55fb2d32fc4200667ba79847db31f6530d commit r15-352-gf2449b55fb2d32fc4200667ba79847db31f6530d Author: Roger Sayle Date: Thu May 9 22:45:54 2024 +0100 Constant fold {-1,-1} << 1 in simplify-rtx.cc This patch addresses a missed optimization opportun

[gcc r15-222] PR target/106060: Improved SSE vector constant materialization on x86.

2024-05-06 Thread Roger Sayle via Gcc-cvs
https://gcc.gnu.org/g:79649a5dcd81bc05c0ba591068c9075de43bd417 commit r15-222-g79649a5dcd81bc05c0ba591068c9075de43bd417 Author: Roger Sayle Date: Tue May 7 07:14:40 2024 +0100 PR target/106060: Improved SSE vector constant materialization on x86. This patch resolves PR target

RE: [PATCH] PR middle-end/111701: signbit(x*x) vs -fsignaling-nans

2024-05-02 Thread Roger Sayle
> From: Richard Biener > On Thu, May 2, 2024 at 11:34 AM Roger Sayle > wrote: > > > > > > > From: Richard Biener On Fri, Apr 26, > > > 2024 at 10:19 AM Roger Sayle > > > wrote: > > > > > > > > This patch address

RE: [PATCH] PR middle-end/111701: signbit(x*x) vs -fsignaling-nans

2024-05-02 Thread Roger Sayle
> From: Richard Biener > On Fri, Apr 26, 2024 at 10:19 AM Roger Sayle > wrote: > > > > This patch addresses PR middle-end/111701 where optimization of > > signbit(x*x) using tree_nonnegative_p incorrectly eliminates a > > floating point multiplication whe

RE: [C PATCH] PR c/109618: ICE-after-error from error_mark_node.

2024-04-30 Thread Roger Sayle
> On Tue, Apr 30, 2024 at 10:23 AM Roger Sayle > wrote: > > Hi Richard, > > Thanks for looking into this. > > > > It’s not the call to size_binop_loc (for CEIL_DIV_EXPR) that's > > problematic, but the call to fold_convert_loc (loc, size_type_node,

RE: [C PATCH] PR c/109618: ICE-after-error from error_mark_node.

2024-04-30 Thread Roger Sayle
which does more of a tree traversal checking error_operand_p within the unary and binary operators of an expression tree. Please let me know what you think/recommend. Best regards, Roger -- > -Original Message- > From: Richard Biener > Sent: 30 April 2024 08:38 > To: Roger Sayle >

[C PATCH] PR c/109618: ICE-after-error from error_mark_node.

2024-04-29 Thread Roger Sayle
ng away) a CEIL_DIV_EXPR in the common case that "char" is a single-byte. The current code relies on the middle-end's tree folding to recognize that CEIL_DIV_EXPR of integer_one_node is a no-op, that can be optimized away. Ok for mainline? 2024-04-30 Roger Sayle gcc/c-family/Chan

[PATCH] PR tree-opt/113673: Avoid load merging from potentially trapping additions.

2024-04-28 Thread Roger Sayle
updating the CFG is a part of the compiler that I'm less familiar with. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-04-28 Roger Sayle gcc/ChangeL

[PATCH] PR middle-end/111701: signbit(x*x) vs -fsignaling-nans

2024-04-26 Thread Roger Sayle
c-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-04-26 Roger Sayle gcc/ChangeLog PR middle-end/111701 * fold-const.cc (tree_binary_nonnegative_warnv_p) : Split handling of flo

[PATCH] PR target/114187: Fix ?Fmode SUBREG simplification in simplify_subreg.

2024-03-03 Thread Roger Sayle
added/modified potentially contributed to this lapse. Using lowpart_subreg should avoid/reduce confusion in future. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for ma

[x86_64 PATCH] PR target/113690: Fix-up MULT REG_EQUAL notes in STV.

2024-02-04 Thread Roger Sayle
64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-02-05 Roger Sayle gcc/ChangeLog PR target/113690 * config/i386/i386-features.cc (timode_convert_cst): New helper functi

[tree-ssa PATCH] PR target/113560: Enhance is_widening_mult_rhs_p.

2024-01-29 Thread Roger Sayle
ootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-01-30 Roger Sayle gcc/ChangeLog PR target/113560 * tree-ssa-math-opts.cc (is_widening_mult_rhs_p): Use range information via tree_non_zero_bits to check i

[libatomic PATCH] PR other/113336: Fix libatomic testsuite regressions on ARM.

2024-01-28 Thread Roger Sayle
This patch is a revised version of the fix for PR other/113336. This patch has been tested on arm-linux-gnueabihf with --with-arch=armv6 with make bootstrap and make -k check where it fixes all of the FAILs in libatomic. Ok for mainline? 2024-01-28 Roger Sayle Victor Do

[middle-end PATCH] Constant fold {-1,-1} << 1 in simplify-rtx.cc

2024-01-26 Thread Roger Sayle
n now checks that VEC_SELECT or some funky (future) rtx_code doesn't cause problems. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline (in stage 1)? 2024-01-26 Roger Sa

RE: [x86 PATCH] PR target/106060: Improved SSE vector constant materialization.

2024-01-25 Thread Roger Sayle
no new failures. Ok for mainline (in stage 1)? 2024-01-25 Roger Sayle Hongtao Liu gcc/ChangeLog PR target/106060 * config/i386/i386-expand.cc (enum ix86_vec_bcast_alg): New. (struct ix86_vec_bcast_map_simode_t): New type for table below. (ix86_vec

RE: [middle-end PATCH] Prefer PLUS over IOR in RTL expansion of multi-word shifts/rotates.

2024-01-19 Thread Roger Sayle
-level might lead to a code quality regression, if RTL expansion doesn't know to lower it back to use PLUS on those targets with lea but without rotate. > From: Richard Biener > Sent: 19 January 2024 11:04 > On Thu, Jan 18, 2024 at 8:55 PM Roger Sayle > wrote: > > > > T

[middle-end PATCH] Prefer PLUS over IOR in RTL expansion of multi-word shifts/rotates.

2024-01-18 Thread Roger Sayle
add2 r1,r2,r1 j_s.d [blink] add2 r0,r3,r0 This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-01-18 Roger Sayle gcc/ChangeLog

[x86 PATCH] PR target/106060: Improved SSE vector constant materialization.

2024-01-16 Thread Roger Sayle
gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-01-16 Roger Sayle gcc/ChangeLog PR target/106060 * config/i386/i386-expand.cc (enum ix86_vec_bcast_alg): New. (struct ix86_vec_bcas
