[x86 PATCH] Improved TImode (128-bit) integer constants on x86_64.

2023-12-18 Thread Roger Sayle
oard=unix{-m32}, and with/without -march=cascadelake with no new failures. Ok for mainline? 2023-12-18 Roger Sayle gcc/ChangeLog * config/i386/i386-expand.cc (ix86_convert_const_wide_int_to_broadcast): Remove static. (ix86_expand_move): Don't attempt t

[x86_PATCH] peephole2 to resolve failure of gcc.target/i386/pr43644-2.c

2023-12-22 Thread Roger Sayle
, %rdx ret which I believe is optimal. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2023-12-21 Roger Sayle gcc/ChangeLog PR target/43644

[x86_64 PATCH] PR target/112992: Optimize mode for broadcast of constants.

2023-12-22 Thread Roger Sayle
rap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2023-12-21 Roger Sayle gcc/ChangeLog PR target/112992 * config/i386/i386-expand.cc (ix86_convert_const_wide_int_to_broadcast)

[ARC PATCH] Table-driven ashlsi implementation for better code/rtx_costs.

2023-12-23 Thread Roger Sayle
j_s [blink] Tested with a cross-compiler to arc-linux hosted on x86_64, with no new (compile-only) regressions from make -k check. Ok for mainline if this passes Claudiu's and/or Jeff's testing? [Thanks again to Jeff for finding the typo in my last ARC patch] 2023-12-23 Roger Sa

Re: [PATCH v3] EXPR: Emit an truncate if 31+ bits polluted for SImode

2023-12-23 Thread Roger Sayle
Hi YunQiang (and Jeff), > MIPS claims TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode)) == true > based on that the hard register is always sign-extended, but here > the hard register is polluted by zero_extract. I suspect that the bug here is that the MIPS backend shouldn't be returning true for

RE: Re: [PATCH v3] EXPR: Emit an truncate if 31+ bits polluted for SImode

2023-12-23 Thread Roger Sayle
> There's a PR in Bugzilla around this representational issue on MIPS, but I can't find > it straight away. Found it. It's PR rtl-optimization/104914, where we've already discussed this in comments #15 and #16. > -----Original Message----- > From: Roger Sayle

RE: [PATCH v3] EXPR: Emit an truncate if 31+ bits polluted for SImode

2023-12-24 Thread Roger Sayle
> What's exceedingly weird is T_N_T_M_P (DImode, SImode) isn't actually a > truncation! The output precision is first, the input precision is second. > The docs > explicitly state the output precision should be smaller than the input > precision > (which makes sense for truncation). > > That

RE: [PATCH v3] EXPR: Emit an truncate if 31+ bits polluted for SImode

2023-12-24 Thread Roger Sayle
> > > What's exceedingly weird is T_N_T_M_P (DImode, SImode) isn't > > > actually a truncation! The output precision is first, the input > > > precision is second. The docs explicitly state the output precision > > > should be smaller than the input precision (which makes sense for > > > trunc

[PATCH] Improved RTL expansion of field assignments into promoted registers.

2023-12-28 Thread Roger Sayle
erate much better code for the above test case. Ok for mainline? 2023-12-28 Roger Sayle gcc/ChangeLog PR rtl-optimization/104914 * expr.cc (expand_assignment): When target is SUBREG_PROMOTED_VAR_P a sign or zero extension is only required if the modified f

[middle-end PATCH] Only call targetm.truly_noop_truncation for truncations.

2023-12-28 Thread Roger Sayle
ddle-end that rely on the default behaviour of silently returning true for any (invalid) input. These are fixed below. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainli

[PATCH] MIPS: Implement TARGET_INSN_COSTS

2023-12-28 Thread Roger Sayle
The current (default) behavior is that when the target doesn't define TARGET_INSN_COST the middle-end uses the backend's TARGET_RTX_COSTS, so multiplications are slower than additions, but about the same size when optimizing for size (with -Os or -Oz). All of this gets disabled with your

RE: [PATCH] Improved RTL expansion of field assignments into promoted registers.

2023-12-28 Thread Roger Sayle
Hi Jeff, Thanks for the speedy review. > On 12/28/23 07:59, Roger Sayle wrote: > > This patch fixes PR rtl-optimization/104914 by tweaking/improving the > > way that fields are written into a pseudo register that needs to be > > kept sign extended. > Well, I think "

RE: [x86_PATCH] peephole2 to resolve failure of gcc.target/i386/pr43644-2.c

2023-12-31 Thread Roger Sayle
Hi Uros, > From: Uros Bizjak > Sent: 28 December 2023 10:33 > On Fri, Dec 22, 2023 at 11:14 AM Roger Sayle > wrote: > > > > This patch resolves the failure of pr43644-2.c in the testsuite, a > > code quality test I added back in July, that started failing as the

[middle-end PATCH take #2] Only call targetm.truly_noop_truncation for truncations.

2023-12-31 Thread Roger Sayle
h has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? Hopefully this revision tests cleanly on the linaro.org CI pipeline. 2023-12-31 Roger Sayle gcc/ChangeLog * combine

[x86 PATCH] PR target/113231: Improved costs in Scalar-To-Vector (STV) pass.

2024-01-06 Thread Roger Sayle
6_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-01-06 Roger Sayle gcc/ChangeLog PR target/113231 * config/i386/i386-features.cc (compute_convert_gain): Include the ove

RE: [x86_64 PATCH] PR target/112992: Optimize mode for broadcast of constants.

2024-01-06 Thread Roger Sayle
pr102021.c: Likewise. * gcc.target/i386/pr90773-17.c: Likewise. Thanks in advance. Roger -- > -----Original Message----- > From: Hongtao Liu > Sent: 02 January 2024 05:40 > To: Roger Sayle > Cc: gcc-patches@gcc.gnu.org; Uros Bizjak > Subject: Re: [x86_64 PATCH] PR target/112992: Opti

[libatomic PATCH] Fix testsuite regressions on ARM [raspberry pi].

2024-01-08 Thread Roger Sayle
unresolved testcases]. If this looks like the correct fix, I'm not confident with rebuilding Makefile.in with the correct version of automake, so I'd very much appreciate it if someone/the reviewer/maintainer could please check this in for me. Thanks in advance. 2024-01-08 Roger Sayle

[RISC-V PATCH] Improve style to work around PR 60994 in host compiler.

2023-12-01 Thread Roger Sayle
-linux-gnu using g++ 4.8.5 as the host compiler. Ok for mainline? 2023-12-01 Roger Sayle gcc/ChangeLog * config/riscv/riscv-vsetvl.cc (csetvl_info::parse_insn): Rename local variable from demand_flags to dflags, to avoid conflicting with (enumeration) type of the same name.

[PATCH] Workaround array_slice constructor portability issues (with older g++).

2023-12-03 Thread Roger Sayle
draws attention to the problem and restores bootstrap whilst better approaches are investigated. For example, an ARRAY_SLICE(table) macro might be appropriate if there isn't an easy/portable template resolution solution. Thoughts? 2023-12-03 Roger Sayle gcc/c-family/ChangeLog

[ARC PATCH] Add *extvsi_n_0 define_insn_and_split for PR 110717.

2023-12-05 Thread Roger Sayle
ons from make -k check. Ok for mainline if this passes Claudiu's nightly testing? 2023-12-05 Roger Sayle gcc/ChangeLog * config/arc/arc.md (*extvsi_n_0): New define_insn_and_split to implement SImode sign extract using an AND, XOR and MINUS sequence. gcc/testsuite/C

RE: [ARC PATCH] Add *extvsi_n_0 define_insn_and_split for PR 110717.

2023-12-07 Thread Roger Sayle
ombine doesn't (normally) like turning two instructions into three. Fingers-crossed the attached patch works better on the nightly testers. Thanks in advance, Roger -- > -----Original Message----- > From: Jeff Law > Sent: 07 December 2023 14:47 > To: Roger Sayle ; gcc-patches@gcc.gn

[PING] PR112380: Defend against CLOBBERs in RTX expressions in combine.cc

2023-12-10 Thread Roger Sayle
to continue exploring alternate simplifications would also lead to better code generation, but I've not been able to find any examples on x86_64. This patch has been retested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no ne

[x86 PATCH] PR target/106933: Limit TImode STV to SSA-like def-use chains.

2022-12-22 Thread Roger Sayle
atch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32}, with no new failures. Ok for mainline? 2022-12-22 H.J. Lu Roger Sayle gcc/ChangeLog PR target/106933 PR target/106959 * config

[x86 PATCH] PR target/107548: Handle vec_select in STV.

2022-12-22 Thread Roger Sayle
patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32}, with no new failures. Ok for mainline? 2022-12-22 Roger Sayle gcc/ChangeLog PR target/107548 * config/i386/i386-features.cc (scalar_chain::add_

[x86 PATCH] Use movss/movsd to implement V4SI/V2DI VEC_PERM.

2022-12-23 Thread Roger Sayle
-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32}, with no new failures. Ok for mainline? 2022-12-23 Roger Sayle gcc/ChangeLog * config/i386/i386-expand.cc (expand_vec_perm_movs): Also allow V4SImode with TARGET_SSE and V2D

[Committed] Tweak new gcc.target/i386/pr107548-1.c for -march=cascadelake.

2022-12-24 Thread Roger Sayle
My recently added testcases gcc.target/i386/pr107548-[12].c need to be tweaked slightly for -march=cascadelake. Committed as obvious. 2022-12-24 Roger Sayle gcc/testsuite/ChangeLog PR target/107548 * gcc.target/i386/pr107548-1.c: Match both vmovd and movd

RE: [x86 PATCH] Use movss/movsd to implement V4SI/V2DI VEC_PERM.

2022-12-25 Thread Roger Sayle
Hi Uros, Many thanks and merry Christmas. Here's the version as committed, implemented using your preferred idiom with mode iterators for movss/movsd. Thanks again. 2022-12-25 Roger Sayle Uroš Bizjak gcc/ChangeLog * config/i386/i386-builtin.def (__builtin_ia32_

[x86 PATCH] Use ix86_expand_clear in ix86_split_ashl.

2022-12-27 Thread Roger Sayle
2022-12-28 Roger Sayle gcc/ChangeLog * config/i386/i386-expand.cc (ix86_split_ashl): Call ix86_expand_clear to generate an xor instruction. gcc/testsuite/ChangeLog * gcc.target/i386/ashlti3-1.c: New test case. Thanks in advance, Roger -- diff --git a/gcc/confi

[x86_64 PATCH] Add post-reload splitter for extendditi2.

2022-12-27 Thread Roger Sayle
ret i.e. the same code for the signed and unsigned extension variants. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32}, with no new failures. Ok for mainline? 2022-12-28 Roger Sayle gcc/ChangeLog

[x86 PATCH] Provide zero_extend versions/variants of several patterns.

2022-12-27 Thread Roger Sayle
definitions remain the same, it's just the expected RTL is slightly different but equivalent. Providing both forms makes the backend more robust to middle-end changes [and possibly catches some missed optimizations]. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -

RE: [x86 PATCH] Provide zero_extend versions/variants of several patterns.

2022-12-28 Thread Roger Sayle
Hi Uros, Many thanks for your reviews. > On Wed, Dec 28, 2022 at 2:15 AM Roger Sayle > wrote: > > > > > > Back in September, the review of my patch for PR > > rtl-optimization/106594, > > https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601501.html

RE: [x86_64 PATCH] Add post-reload splitter for extendditi2.

2023-01-01 Thread Roger Sayle
64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32}, with no new failures. Ok for mainline? 2023-01-01 Roger Sayle Uroš Bizjak gcc/ChangeLog * config/i386/i386.md (extendditi2): New define_insn. (define_split)

[PATCH] Fix RTL simplifications of FFS, POPCOUNT and PARITY.

2023-01-01 Thread Roger Sayle
x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32}, with no new failures. Ok for mainline? 2023-01-01 Roger Sayle gcc/ChangeLog * gcc/simplify-rtx.cc (simplify_unary_operation_1) : Avoid generating FFS with mismatched o

[x86 PATCH] PR target/108229: A minor STV compute_convert_gain tweak.

2023-01-01 Thread Roger Sayle
arameterization (and it's dangerous to select parameters from the N=1 statistics of a single bugzilla PR). This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32}, with no new failures. Ok for mainline? 2023-01-01

[x86 PATCH] Improve ix86_expand_int_movcc to allow condition (mask) sharing.

2023-01-02 Thread Roger Sayle
-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32}, with no new failures. Ok for mainline? 2023-01-02 Roger Sayle gcc/ChangeLog * config/i386/i386-expand.cc (ix86_expand_int_movcc): Rewrite RTL expansion to allow condition (mask

[PATCH] PR tree-optimization/92342: Optimize b & -(a==c) in match.pd

2023-01-03 Thread Roger Sayle
ed on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32}, with no new failures. Ok for mainline? 2023-01-03 Andrew Pinski Roger Sayle gcc/ChangeLog: PR tree-optimization/92342 * match.pd ((m1 CMP m2) * d ->
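As a rough illustration of the expression named in the subject line above (not of the patch itself, whose match.pd pattern is truncated in this snippet), the following C sketch compares the mask form `b & -(a==c)` with an equivalent conditional form. The function names `mask_select` and `cond_select` are hypothetical, chosen only for this example, and two's complement integers are assumed.

```c
/* Mask form: (a == c) evaluates to 0 or 1, so -(a == c) is either 0 or
   -1 (all bits set on a two's complement target).  ANDing with b then
   yields either 0 or b.  */
int mask_select(int a, int b, int c)
{
    return b & -(a == c);
}

/* Conditional form that computes the same value.  */
int cond_select(int a, int b, int c)
{
    return (a == c) ? b : 0;
}
```

Both functions return `b` when `a == c` and zero otherwise; which form a compiler prefers to emit is target dependent.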

[x86_64 PATCH] Introduce insvti_highpart define_insn_and_split.

2023-01-05 Thread Roger Sayle
nline? Please let me know if you'd prefer a different pattern name [insv seemed better than mov]. 2023-01-05 Roger Sayle gcc/ChangeLog * config/i386/i386.md (any_or_plus): Move definition earlier. (*insvti_highpart_1): New define_insn_and_split to overwrite

[Committed] PR rtl-optimization/108292: Revert "Improve ix86_expand_int_movcc to allow condition (mask) sharing"

2023-01-05 Thread Roger Sayle
I agree with Uros that it's best to revert my recent patch that caused PR rtl-optimization/108292. Sorry for the inconvenience. This reverts commit d0558f420b2a5692fd38ac76ffa97ae6c1726ed9. 2023-01-05 Roger Sayle gcc/ChangeLog PR rtl-optimization/108292 * config/i386

[nvptx PATCH] Correct pattern for popcountdi2 insn in nvptx.md.

2023-01-09 Thread Roger Sayle
). This patch has been tested on nvptx-none (hosted on x86_64-pc-linux-gnu) with make and make -k check with no new failures. This functionality is already tested by gcc.target/nvptx/popc-[123].c. Ok for mainline? 2023-01-09 Roger Sayle gcc/ChangeLog * config/nvptx/nvptx.md (popcount2

[x86 PATCH] PR rtl-optimization/107991: peephole2 to tweak register allocation.

2023-01-09 Thread Roger Sayle
movl %esi, %eax subl %edx, %esi testb %dil, %dil cmovne %esi, %eax ret This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32}, with no new failures. Ok for mainlin

[PATCH] PR rtl-optimization/106421: ICE in bypass_block from non-local goto.

2023-01-09 Thread Roger Sayle
check, both with and without --target_board=unix{-m32}, with no new failures. Ok for mainline? 2023-01-09 Roger Sayle gcc/ChangeLog PR rtl-optimization/106421 * cprop.cc (bypass_block): Check that DEST is local to this function (non-NULL) before calling find_edge.

RE: [x86 PATCH] PR rtl-optimization/107991: peephole2 to tweak register allocation.

2023-01-10 Thread Roger Sayle
ps. Cheers, Roger -- > -----Original Message----- > From: Richard Sandiford > Sent: 10 January 2023 10:48 > To: Uros Bizjak > Cc: GCC Patches ; Roger Sayle > > Subject: Re: [x86 PATCH] PR rtl-optimization/107991: peephole2 to tweak > register allocation. > > Uros B

RE: [x86_64 PATCH] Improve __int128 argument passing (in ix86_expand_move).

2023-09-01 Thread Roger Sayle
ciently as we were before. As you/clang show, we could do better. Thanks again, and sorry for any inconvenience. Best regards, Roger -- > -----Original Message----- > From: Manolis Tsamis > Sent: 01 September 2023 11:45 > To: Uros Bizjak > Cc: Roger Sayle ; gcc-patches@gcc.

[x86 PATCH] Improve reg pressure of double-word right-shift then truncate.

2023-11-12 Thread Roger Sayle
-m32} with no new failures. Ok for mainline? 2023-11-12 Roger Sayle gcc/ChangeLog * config/i386/i386.md (3_doubleword_lowpart): New define_insn_and_split to optimize register usage of doubleword right shifts followed by truncation. Thanks in advance, Roger -- diff --

[PATCH] PR112380: Defend against CLOBBERs in RTX expressions in combine.cc

2023-11-12 Thread Roger Sayle
ing through the fall-out sufficient for x86_64 to bootstrap and regression test without new failures. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2023-11-12

[ARC PATCH] Split SImode shifts pre-reload on !TARGET_BARREL_SHIFTER.

2023-09-28 Thread Roger Sayle
lsr_s r1,r1 2: # end single insn loop j_s.d [blink] or_s r0,r0,r1 Thanks in advance, Roger 2023-09-28 Roger Sayle gcc/ChangeLog * config/arc/arc-protos.h (emit_shift): Delete prototype. (arc_pre_reload_split): New function prototype. * config/arc/a

[x86 SSE PATCH] Some additional ternlog refinements.

2024-06-27 Thread Roger Sayle
ently use decimal. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-06-27 Roger Sayle gcc/ChangeLog * config/i386/i386-expand.cc (ix86_ternlo

[x86 PATCH] Handle sign_extend like zero_extend in *concatditi3_[346]

2024-06-27 Thread Roger Sayle
with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-06-27 Roger Sayle gcc/ChangeLog * config/i386/i386.md (*concat3_3): Change zero_extend to any_extend in first operand to left shift by mode precision. (*concat3_4): Likewise.

RE: nvptx vs. [PATCH] Add a late-combine pass [PR106594]

2024-06-27 Thread Roger Sayle
.@ventanamicro.com; rdapp@gmail.com; gcc-patches@gcc.gnu.org; > Tom de Vries ; Roger Sayle > Subject: Re: nvptx vs. [PATCH] Add a late-combine pass [PR106594] > > Hi! > > On 2024-06-27T22:27:21+0200, I wrote: > > On 2024-06-27T18:49:17+0200, I wrote: > >> On 2023-10-

[x86 PATCH]: Additional peephole2 to use lea in round-up integer division.

2024-06-29 Thread Roger Sayle
inux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-06-29 Roger Sayle gcc/ChangeLog * config/i386/i386.md (peephole2): Transform two consecutive additions into a 3-component lea if !TARGET

[testsuite PATCH] Fix -m32 gcc.target/i386/pr102464-vrndscaleph.c on RedHat.

2024-06-30 Thread Roger Sayle
ound is to define __NO_MATH_INLINES before #include (or alternatively use __builtin_floor, __builtin_ceil, etc.). This patch has been tested on x86_64-pc-linux-gnu with make -k check, with and without --target_board=unix{-m32}. Ok for mainline? 2024-06-30 Roger Sayle gcc/testsuite/ChangeLog

RE: [x86 PATCH]: Additional peephole2 to use lea in round-up integer division.

2024-06-30 Thread Roger Sayle
Hi Uros, > On Sat, Jun 29, 2024 at 6:21 PM Roger Sayle > wrote: > > A common idiom for implementing an integer division that rounds > > upwards is to write (x + y - 1) / y. Conveniently on x86, the two > > additions to form the numerator can be performed by a single
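The round-up division idiom quoted in the message above, `(x + y - 1) / y`, can be sketched in C as follows. This is a minimal illustration rather than code from the patch: the function name `div_round_up` is hypothetical, unsigned operands are an assumption, and the sum `x + y - 1` is assumed not to wrap around.

```c
#include <assert.h>

/* Integer division that rounds upwards, using the idiom from the quoted
   message.  Assumes y is nonzero and that x + y - 1 does not overflow
   unsigned int.  */
unsigned int div_round_up(unsigned int x, unsigned int y)
{
    assert(y != 0);
    return (x + y - 1) / y;
}
```

For example, `div_round_up(7, 2)` gives 4 while `div_round_up(8, 2)` also gives 4, since the numerator is bumped by `y - 1` before the truncating division.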

[x86 SSE PATCH] Remove legacy ternlog patterns from sse.md

2024-06-30 Thread Roger Sayle
hange. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-06-30 Roger Sayle gcc/ChangeLog * config/i386/sse.md (*vmov_constm1_pternlog_false_dep):

[x86 PATCH] Add additional variant of bswaphisi2_lowpart peephole2.

2024-07-01 Thread Roger Sayle
$8, %di jmp ext This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-07-01 Roger Sayle gcc/ChangeLog * config/i386/i386.md (bswaphisi2_lowpa

[x86 SSE PATCH] PR target/115751: Avoid force_reg in ix86_expand_ternlog.

2024-07-04 Thread Roger Sayle
This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-07-04 Roger Sayle gcc/ChangeLog PR target/115751 * config/i386/i386-expand.cc (ix86_expand_t

[x86 SSE PATCH] Some AVX512 ternlog expansion refinements.

2024-07-07 Thread Roger Sayle
} with no new failures. Ok for mainline? 2024-07-07 Roger Sayle gcc/ChangeLog * config/i386/i386-expand.cc (ix86_broadcast_from_constant): Use CONST_VECTOR_P instead of comparison against GET_CODE. (ix86_gen_bcst_mem): Likewise. (ix86_ternlog_leaf_p): Likewise

[match.pd PATCH] PR tree-optimization/114661: Generalize MULT_EXPR recognition.

2024-07-09 Thread Roger Sayle
t This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-07-09 Roger Sayle gcc/ChangeLog PR tree-optimization/114661 * match.pd ((X*C1)|(X*C2

[nvptx PATCH] Implement rtx_costs target hook for nvptx backend.

2024-07-11 Thread Roger Sayle
s 4.123190 seconds So about a 3.7x performance improvement. This patch has been tested with make and make -k check for nvptx-none hosted on x86_64-pc-linux-gnu with no new failures. Ok for mainline? 2024-07-11 Roger Sayle gcc/ChangeLog * config/nvptx/nvptx.cc (nvptx_rtx_size_costs): New f

[ARC PATCH] Improve performance of SImode right shifts.

2024-07-11 Thread Roger Sayle
ns@16 cycles This patch has been minimally tested by building a cross-compiler to arc-linux hosted on x86_64-pc-linux-gnu where there are no new failures from "make -k check" in the compile-only tests. Ok for mainline (after 3rd-party testing)? 2024-07-11 Roger Sayle gcc/ChangeLog

[x86 SSE PATCH] Some AVX512 ternlog expansion refinements (take #2)

2024-07-11 Thread Roger Sayle
line? 2024-07-11 Roger Sayle Hongtao Liu gcc/ChangeLog * config/i386/i386-expand.cc (ix86_broadcast_from_constant): Use CONST_VECTOR_P instead of comparison against GET_CODE. (ix86_gen_bcst_mem): Likewise. (ix86_ternlog_le

[match.pd PATCH] PR tree-optimization/114661: Generalize MULT_EXPR recognition (take #2)

2024-07-14 Thread Roger Sayle
tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-07-14 Roger Sayle Richard Biener gcc/ChangeLog PR tree-optimization/114661 * match.pd ((X*C1)|(X*C2) to

[x86 PATCH] Tweak i386-expand.cc to restore bootstrap on RHEL.

2024-07-14 Thread Roger Sayle
ke bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures (from this change). Ok for mainline? 2024-07-14 Roger Sayle * config/i386/i386-expand.cc (ix86_expand_fp_absneg_operator): Use E_?Fmode enumeration constants in switch statement.

Re: [pushed] Add function filtering to gcov

2024-07-14 Thread Roger Sayle
I’m seeing (dejagnu) testsuite problems from this (recent) patch. Running /home/roger/GCC/patchem/gcc/testsuite/gcc.misc-tests/gcov.exp ... ERROR: (DejaGnu) proc "lmap key { snd } { if { $key in $seen } continue set key }" does not exist. The error code is NONE The info on th

RE: [PATCH] Use foreach, not lmap, for tcl <= 8.5 compat

2024-07-16 Thread Roger Sayle
Hi Jørgen, Awesome. Very many thanks for the speedy fix. Roger -- > -----Original Message----- > From: Jørgen Kvalsvik > Sent: 14 July 2024 20:46 > To: gcc-patches@gcc.gnu.org > Cc: jeffreya...@gmail.com; ro...@nextmovesoftware.com; Jørgen Kvalsvik > > Subject: [PATCH] Use foreach, not lmap,

[PATCH] Implement a -ftrapping-math/-fsignaling-nans TODO in match.pd.

2024-07-17 Thread Roger Sayle
e? 2024-07-17 Roger Sayle gcc/ChangeLog * match.pd ((FTYPE) N CMP CST): Only worry about exceptions with flag_trapping_math, and about signaling NaNs with HONOR_SNANS. gcc/testsuite/ChangeLog * c-c++-common/pr57371-4.c: Update comment. * c-c++-common/pr57371-5

[x86 SSE] Improve handling of ternlog instructions in i386/sse.md (v2)

2024-05-17 Thread Roger Sayle
64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-05-17 Roger Sayle Hongtao Liu gcc/ChangeLog PR target/115021 * config/i386/i386-expand.cc (ix86_expand

[PATCH] Avoid ICE in except.cc on targets that don't support exceptions.

2024-05-22 Thread Roger Sayle
This patch has been tested on nvptx-none hosted on x86_64-pc-linux-gnu with no new failures in the testsuite, and ~220 fewer FAILs. Ok for mainline? 2024-05-22 Roger Sayle gcc/ChangeLog * except.cc (output_function_exception_table): Move call to get_personality

[x86_64 PATCH] Correct insn_cost of movabsq.

2024-05-22 Thread Roger Sayle
e -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-05-22 Roger Sayle gcc/ChangeLog * config/i386/i386.cc (ix86_rtx_costs) : A CONST_INT that isn't x86_64_immediate_operand requires an extra (expensive) movabsq in

[x86 PATCH] Tweak ix86_mode_can_transfer_bits to restore bootstrap on RHEL.

2024-08-08 Thread Roger Sayle
DFmode being "non-literal types in constant expressions". This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, with no new failures. Ok for mainline? 2024-08-08 Roger Sayle gcc/ChangeLog * config/i386/i386.cc (ix86_mode_can_transfer_bit

[x86 PATCH] PR target/116275: Handle STV of *extenddi2_doubleword_highpart

2024-08-11 Thread Roger Sayle
h and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-08-11 Roger Sayle gcc/ChangeLog PR target/116275 * config/i386/i386.md (*extendv2di2_highpart_stv_noavx512vl): New define_insn_and_split to handle the STV conversion of the DImode pa

RE: [PATCH] Re-add calling emit_clobber in lower-subreg.cc's resolve_simple_move.

2024-08-13 Thread Roger Sayle
Hi Xianmiao, I have no objection to reverting that original patch, if it was indeed made obsolete by later changes to the i386 backend. The theory at the time was that it was possible for backends to define mov instructions that emitted clobbers if necessary, but it's very difficult for a backen

[x86 PATCH] Improve split of *extendv2di2_highpart_stv_noavx512vl.

2024-08-15 Thread Roger Sayle
which applies when not performing the above optimization, i.e. on TARGET_XOP. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-08-15 Roger Sayle Uros B

[x86_64 PATCH] Support wide immediate constants in STV.

2024-08-15 Thread Roger Sayle
instruction. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-08-15 Roger Sayle gcc/ChangeLog * config/i386/i386-features.cc (timode_immed_const_gain): New

[x86_64 PATCH] Update STV's gains for TImode arithmetic right shifts on AVX2.

2024-08-24 Thread Roger Sayle
ithout --target_board=unix{-m32} with no new failures. No new testcase (yet) as the code for both the vector and scalar forms of the above function are still suboptimal so code generation is in flux, but this improvement should be a step in the right direction. Ok for mainline? 2024-08-24 Roger Sayle

[x86 PATCH] PR target/115351: RTX costs for *concatditi3 and *insvti_highpart.

2024-06-07 Thread Roger Sayle
e? 2024-06-07 Roger Sayle gcc/ChangeLog PR target/115351 * config/i386/i386.cc (ix86_rtx_costs): Provide estimates for the *concatditi3 and *insvti_highpart patterns, about two insns. gcc/testsuite/ChangeLog PR target/115351 * g++.target/i386/pr1153

[analyzer PATCH] Restore bootstrap with g++ 4.8.

2024-06-07 Thread Roger Sayle
using "scl enable devtoolset-10") as host compilers. Ok for mainline? 2024-06-07 Roger Sayle gcc/analyzer/ChangeLog * constraint-manager.cc (equiv_class::make_dump_widget): Use std::move to return a std::unique_ptr. (bounded_ranges_constraint::make_dump_wi

[x86 PATCH] PR target/115397: AVX512 ternlog vs. -m32 -fPIC constant pool.

2024-06-10 Thread Roger Sayle
x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-06-10 Roger Sayle gcc/ChangeLog PR target/115397 * config/i386/i386-expand.cc (ix86_expand_te

[x86 PATCH] More use of m{32,64}bcst addressing modes with ternlog.

2024-06-12 Thread Roger Sayle
ret // 1 = 42 total This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-06-12 Roger Sayle gcc/ChangeLog * config/i386/i38

[x86 PATCH] Allow all register_operand SUBREGs in x86_ternlog_idx.

2024-06-18 Thread Roger Sayle
ode V4SF. This patch allows the recently added ternlog_operand to accept this case. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-06-18 Roger Sayle gcc/C

[PATCH v2] PR tree-opt/113673: Avoid load merging when potentially trapping.

2024-06-21 Thread Roger Sayle
ke bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-06-21 Roger Sayle Richard Biener gcc/ChangeLog PR tree-optimization/113673 * gimple-ssa-store-merging.cc (find_bswap_or_nop_lo

[ARC PATCH] Improved SImode conditional moves (improves DImode shifts).

2024-06-22 Thread Roger Sayle
sue is also described at https://github.com/foss-for-synopsys-dwc-arc-processors/gcc/issues/110 Tested with a cross-compiler to arc-linux hosted on x86_64, with no new (compile-only) regressions from make -k check. Ok for mainline if this passes Claudiu's and/or Jeff's testing? 20

[ARC PATCH] Improve performance of SImode right shifts (take #2)

2024-07-22 Thread Roger Sayle
opsys, is anyone able to test these changes? Thanks in advance. 2024-07-22 Roger Sayle gcc/ChangeLog * config/arc/arc-protos.h (output_rlc_loop): Prototype here. (arc_split_rlc): Prototype here. * config/arc/arc.cc (output_rlc_loop): Output a zero-overhead loop o

[testsuite PATCH] Robustify lib/g++.exp

2024-07-22 Thread Roger Sayle
's no harm in (also) confirming that it exists in g++_include_flags. This patch has been tested on x86_64-pc-linux-gnu (where it allows a cross-compiler to arc-linux to produce g++ compilation results). Ok for mainline? 2024-07-22 Roger Sayle gcc/testsuite/ChangeLog * lib/g++.

[match.pd PATCH] Fold ctz(-x) as ctz(x).

2024-07-23 Thread Roger Sayle
with no new failures. Ok for mainline? 2024-07-23 Roger Sayle gcc/ChangeLog * match.pd (ctz (-X) => ctz (X)): New simplification. gcc/testsuite/ChangeLog * gcc.dg/fold-ctz-1.c: New test case. Thanks in advance, Roger -- diff --git a/gcc/match.pd b/gcc/match.pd index 6818856..
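The identity behind the `ctz (-X) => ctz (X)` simplification listed above can be checked with a small C sketch. It relies on the fact that two's complement negation leaves the lowest set bit in place. The helper names `ctz_of` and `ctz_of_negated` are hypothetical, and unsigned arithmetic is used here as an assumption to avoid signed overflow when negating; the patch itself operates on GCC's internal representation, which this example does not attempt to reproduce.

```c
#include <assert.h>

/* Count trailing zero bits of x.  __builtin_ctz is a GCC/Clang builtin
   whose result is undefined for zero, hence the assertion.  */
int ctz_of(unsigned int x)
{
    assert(x != 0);
    return __builtin_ctz(x);
}

/* Count trailing zero bits of the two's complement negation of x.
   0u - x wraps modulo 2^N, so no overflow issue arises.  */
int ctz_of_negated(unsigned int x)
{
    assert(x != 0);
    return __builtin_ctz(0u - x);
}
```

For any nonzero `x` the two helpers should agree; for instance, 12 and its negation both have two trailing zero bits.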

[nvptx PATCH] Implement isfinite and isnormal optabs in nvptx.md.

2024-07-27 Thread Roger Sayle
/pipermail/gcc-patches/2024-July/657881.html [which I'm sad to see is taking a while to review/get approved]. Ok for mainline? 2024-07-27 Roger Sayle gcc/ChangeLog * config/nvptx/nvptx.md (UNSPEC_COPYSIGN): No longer required. (UNSPEC_ISFINITE): New UNSPEC. (

[PATCH] PR tree-optimization/57371: Optimize (float)i == 16777222.0f sometimes.

2024-07-28 Thread Roger Sayle
make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? If the testcases need to be tweaked for non-IEEE targets (the transformations themselves should be portable to VAX and IBM floating point formats) hopefully that can be done as follow-up patches

[x86_64 PATCH] Refactor V2DI arithmetic right shift expansion for STV.

2024-08-05 Thread Roger Sayle
ficial). This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-08-05 Roger Sayle gcc/ChangeLog * config/i386/i386-expand.cc (ix86_expand_v2di_ashiftrt): New

[x86_64 PATCH] Support memory destinations and wide immediate constants in STV.

2024-08-05 Thread Roger Sayle
-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-08-05 Roger Sayle gcc/ChangeLog * config/i386/i386-features.cc (timode_immed_const_gain): New function to determine the gain/cost on a CONST_

RE: [x86_64 PATCH] Refactor V2DI arithmetic right shift expansion for STV.

2024-08-07 Thread Roger Sayle
e has been committed as obvious. Sorry again for the inconvenience. Tested on x86_64-pc-linux-gnu with RUNTESTFLAGS="dg.exp=sse2-pr85572-1.C". 2024-08-07 Roger Sayle gcc/testsuite/ChangeLog * g++.dg/other/sse2-pr85572-1.C: Update expected output after my recent patc

[PATCH] PR middle-end/111701: signbit(x*x) vs -fsignaling-nans

2024-04-26 Thread Roger Sayle
c-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-04-26 Roger Sayle gcc/ChangeLog PR middle-end/111701 * fold-const.cc (tree_binary_nonnegative_warnv_p) : Split handling of flo

[PATCH] PR tree-opt/113673: Avoid load merging from potentially trapping additions.

2024-04-28 Thread Roger Sayle
updating the CFG is a part of the compiler that I'm less familiar with. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-04-28 Roger Sayle gcc/ChangeL

[C PATCH] PR c/109618: ICE-after-error from error_mark_node.

2024-04-29 Thread Roger Sayle
ng away) a CEIL_DIV_EXPR in the common case that "char" is a single-byte. The current code relies on the middle-end's tree folding to recognize that CEIL_DIV_EXPR of integer_one_node is a no-op, that can be optimized away. Ok for mainline? 2024-04-30 Roger Sayle gcc/c-family/Chan

RE: [C PATCH] PR c/109618: ICE-after-error from error_mark_node.

2024-04-30 Thread Roger Sayle
which does more of a tree traversal checking error_operand_p within the unary and binary operators of an expression tree. Please let me know what you think/recommend. Best regards, Roger -- > -Original Message- > From: Richard Biener > Sent: 30 April 2024 08:38 > To: Roger Sayle >

RE: [C PATCH] PR c/109618: ICE-after-error from error_mark_node.

2024-04-30 Thread Roger Sayle
> On Tue, Apr 30, 2024 at 10:23 AM Roger Sayle > wrote: > > Hi Richard, > > Thanks for looking into this. > > > > It’s not the call to size_binop_loc (for CEIL_DIV_EXPR) that's > > problematic, but the call to fold_convert_loc (loc, size_type_node,

RE: [PATCH] PR middle-end/111701: signbit(x*x) vs -fsignaling-nans

2024-05-02 Thread Roger Sayle
> From: Richard Biener > On Fri, Apr 26, 2024 at 10:19 AM Roger Sayle > wrote: > > > > This patch addresses PR middle-end/111701 where optimization of > > signbit(x*x) using tree_nonnegative_p incorrectly eliminates a > > floating point multiplication whe

RE: [PATCH] PR middle-end/111701: signbit(x*x) vs -fsignaling-nans

2024-05-02 Thread Roger Sayle
> From: Richard Biener > On Thu, May 2, 2024 at 11:34 AM Roger Sayle > wrote: > > > > > > > From: Richard Biener On Fri, Apr 26, > > > 2024 at 10:19 AM Roger Sayle > > > wrote: > > > > > > > > This patch address

[x86 PATCH] Improve V[48]QI shifts on AVX512

2024-05-09 Thread Roger Sayle
ch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-05-09 Roger Sayle gcc/ChangeLog * config/i386/i386-expand.cc (ix86_expand_vecop_qihi_partial): Don

Re: [x86 PATCH] Improve V[48]QI shifts on AVX512

2024-05-10 Thread Roger Sayle
his weekend. Thanks again, Roger > From: Hongtao Liu > On Fri, May 10, 2024 at 6:26 AM Roger Sayle > wrote: > > > > > > The following one line patch improves the code generated for V8QI and > > V4QI shifts when AVX512BW and AVX512VL functionality is available.

[x86 SSE] Improve handling of ternlog instructions in i386/sse.md

2024-05-12 Thread Roger Sayle
inux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-05-12 Roger Sayle gcc/ChangeLog PR target/115021 * config/i386/i386-expand.cc (ix86_expand_args_builtin): Call fixup_modeless_co

[x86_64 PATCH] Support read-modify-write memory operands in STV.

2024-08-31 Thread Roger Sayle
xmm0 vmovdqa %xmm0, m(%rip) ret This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-08-31 Roger Sayle gcc/ChangeLog * config/i386/i386-feature
