[PATCH take #2] PR tree-optimization/71343: Optimize (X<

2022-08-12 Thread Roger Sayle
-m32} with no new failures. Ok for mainline? 2022-08-12 Roger Sayle Richard Biener gcc/ChangeLog PR tree-optimization/71343 * match.pd (op (lshift @0 @1) (lshift @2 @1)): Optimize the expression (X<>C)^(Y>>C) to (X^Y)>>C for b

RE: [PATCH] PR tree-optimization/64992: (B << 2) != 0 is B when B is Boolean.

2022-08-12 Thread Roger Sayle
Hi Richard, > -Original Message- > From: Richard Biener > Sent: 08 August 2022 12:49 > Subject: Re: [PATCH] PR tree-optimization/64992: (B << 2) != 0 is B when B is > Boolean. > > On Mon, Aug 8, 2022 at 11:06 AM Roger Sayle > wrote: > > > > Thi
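
The identity in the subject line can be checked with a small C sketch. This is only an illustration of the arithmetic, not the match.pd pattern from the patch; the function name `shifted_test` is hypothetical.

```c
#include <stdbool.h>

/* For a Boolean B (value 0 or 1), B << 2 is either 0 or 4,
   so testing the shifted value against zero just recovers B. */
bool shifted_test(bool b)
{
    return (b << 2) != 0;
}
```

Calling `shifted_test(b)` should give back `b` for both possible inputs, which is why the compiler may drop the shift and the comparison.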

[x86_64 PATCH] Support shifts and rotates by integer constants in TImode STV.

2022-08-15 Thread Roger Sayle
target_board=unix{-m32}, with no new failures. Ok for mainline? 2022-08-15 Roger Sayle gcc/ChangeLog * config/i386/i386-features.cc (timode_scalar_chain::compute_convert_gain): Provide costs for shifts and rotates. Provide gains for comparisons against 0/-1.

[Committed] PR target/106640: Fix use of XINT in TImode compute_convert_gain.

2022-08-17 Thread Roger Sayle
I was thinking. Corrected by the patch below, tested on x86_64-pc-linux-gnu with make bootstrap, both with and without --enable-checking=rtl, and regression tested, both with and without --target_board=unix{-m32}, with no new failures. Committed to mainline as obvious. 2022-08-17 Roger Sayle gcc/Chang

[PATCH] PR rtl-optimization/106594: Preserve zero_extend when cheap.

2022-09-11 Thread Roger Sayle
ne? Fingers-crossed, Uros can review the x86 backend changes, which are potentially independent (fixing regressions caused by the middle-end changes), but included in this post to provide better context. TIA. 2022-09-12 Roger Sayle gcc/ChangeLog PR rtl-optimization/106594

RE: [PATCH] PR rtl-optimization/106594: Preserve zero_extend when cheap.

2022-09-12 Thread Roger Sayle
Hi Richard, > "Roger Sayle" writes: > > This patch addresses PR rtl-optimization/106594, a significant > > performance regression affecting aarch64 recently introduced (exposed) > > by one of my recent RTL simplification improvements. Firstly many > > tha

[PATCH] PR target/106877: Robustify reg-stack to malformed asm.

2022-09-13 Thread Roger Sayle
cc.gnu.org/pipermail/gcc-patches/2018-March/495193.html at this second (similar) gcc_assert. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32}, with no new failures. Ok for mainline? 2022-09-13 Roger Sayle gcc

[x86 PATCH] PR target/92578: Peephole2s to tweak cmove register allocation.

2022-04-25 Thread Roger Sayle
ion is one source of confusion when comparing code generation with vs. without cmove (the other major source of confusion being that well-predicted branches are free, but that prediction-quality is poorly predictable). 2022-04-25 Roger Sayle gcc/ChangeLog PR target/92578

[Committed] PR testsuite/105486: Use "signed char" in gcc.dg/pr102950.c

2022-05-05 Thread Roger Sayle
citly using "signed char" so that it's testing the intended EVRP behaviour portably. Committed as obvious. 2022-05-05 Roger Sayle gcc/testsuite/ChangeLog PR testsuite/105486 * gcc.dg/pr102950.c: Use explicit "signed char" in test case. Roger -- dif

RE: [PATCH] PR tree-optimization/83907: Improved memset handling in strlen pass.

2022-05-12 Thread Roger Sayle
Hi Jeff, Any chance you could take a look at this patch, now that we're back in stage1? Thanks in advance, Roger > -Original Message- > From: Jeff Law > Sent: 02 March 2022 19:33 > To: Roger Sayle ; gcc-patches@gcc.gnu.org > Subject: Re: [PATCH] PR tree-optimiza

[x86 PATCH take 2] Improved V1TI (and V2DI) mode equality/inequality.

2022-05-13 Thread Roger Sayle
93454.html This revised patch has been (re)tested against mainline (GCC 13) on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32}, with no new failures. Ok for mainline? 2022-05-13 Roger Sayle Uroš Bizjak gcc

[x86 PATCH take 2] Avoid andn and generate shorter not;and with -Oz.

2022-05-17 Thread Roger Sayle
s for LEGACY_INT_REG_P and REX_INT_REG_P. This patch has been tested against gcc13 trunk on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32}, with no new failures. Ok for mainline? 2022-05-17 Roger Sayle gcc/ChangeLog * config

[x86 PATCH] Correct ix86_rtx_cost for multi-word multiplication.

2022-05-17 Thread Roger Sayle
selector for the upcoming new test case gcc.target/i386/pr98865.c. Ok for mainline? 2022-05-17 Roger Sayle gcc/ChangeLog * config/i386/i386.cc (ix86_rtx_costs) [MULT]: When mode size is wider than word_mode, a multiplication costs three word_mode multiplications and two
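
As a rough illustration of why a double-word multiplication is costed as several word_mode multiplications, here is a sketch of a 64-bit product built from 32-bit halves. It assumes a 32-bit word and is not the code GCC emits; the name `mul64_from_32` is hypothetical.

```c
#include <stdint.h>

/* Low 64 bits of a*b using 32-bit pieces: one widening multiply for the
   low halves, plus two truncating multiplies for the cross terms.
   The high*high term only affects bits above 64 and is dropped. */
uint64_t mul64_from_32(uint64_t a, uint64_t b)
{
    uint32_t al = (uint32_t)a, ah = (uint32_t)(a >> 32);
    uint32_t bl = (uint32_t)b, bh = (uint32_t)(b >> 32);

    uint64_t low   = (uint64_t)al * bl;   /* multiplication 1 (widening) */
    uint32_t cross = al * bh + ah * bl;   /* multiplications 2 and 3     */

    return low + ((uint64_t)cross << 32);
}
```

The result should agree with a native `a * b` on `uint64_t` for any inputs, since both are taken modulo 2^64.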

[PATCH] Simplify logic in tree-scalar-evolution's expensive_expression_p.

2022-05-17 Thread Roger Sayle
owed, they'll need to be managed explicitly with SAVE_EXPR (or similar mechanism) that avoids exponential growth. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32}, with no new failures. Is this a reasonable cl

[PATCH take #2] PR middle-end/98865: Expand X*Y as X&-Y when Y is [0,1].

2022-05-18 Thread Roger Sayle
ke -k check, both with and without --target_board=unix{-m32}, with no new failures. Many thanks to Uros for the speedy review and approval of my x86_rtx_costs patch that enables this transformation on -m32 using the correct cost of DImode multiplication. Ok for mainline? 2022-05-18 Roger

[x86 PATCH] Some additional ix86_rtx_costs clean-ups: NEG, AND and pandn.

2022-05-18 Thread Roger Sayle
oth with and without --target_board=unix{-m32}, with no new failures. Ok for mainline? 2022-05-18 Roger Sayle gcc/ChangeLog * config/i386/i386.cc (ix86_rtx_costs): Split AND from XOR/IOR. Multi-word binary logic operations require two instructions. For vector integer m

RE: [x86 PATCH] Some additional ix86_rtx_costs clean-ups: NEG, AND and pandn.

2022-05-20 Thread Roger Sayle
nline? 2022-05-20 Roger Sayle gcc/ChangeLog * config/i386/i386.cc (ix86_rtx_costs): Split AND from XOR/IOR. Multi-word binary logic operations require two instructions. For vector integer modes, AND with a NOT operand requires only a single instruction (pandn). Lik

[PING] PR middle-end/95126: Expand small const structs as immediate constants

2022-05-21 Thread Roger Sayle
nst gcc 13 trunk on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2022-05-21 Roger Sayle gcc/ChangeLog PR middle-end/95126 * calls.cc (load_register_parameters): When loading

[PATCH] Simplify vec_unpack of uniform_vector_p constructors in match.pd.

2022-05-21 Thread Roger Sayle
ackhq, with punpcklqdq. This transformation is also useful for helping CSE to spot that unpack_hi and unpack_lo are equivalent. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check with no new failures. Ok for mainline? 2022-05-21 Roger Sayle gcc/Chan

[PATCH] Minor improvement to genpreds.cc

2022-05-22 Thread Roger Sayle
if (str[1] == 'c') ... The equivalent optimization is performed by GCC (but perhaps not by the host compiler), but generating simpler/smaller code may encourage further optimizations (such as use of a switch statement). This patch has been tested on x86_64-pc-linux-gnu with make boot

[PATCH] PR tree-optimization/105668: Provide RTL expansion for VEC_COND_EXPR.

2022-05-22 Thread Roger Sayle
oth with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2022-05-23 Roger Sayle gcc/ChangeLog PR tree-optimization/105668 * expr.cc (expand_vec_cond_expr): New function to expand VEC_COND_EXPR using vector mode logical instructions.

[x86 PATCH] PR tree-optimization/105668: Provide vcond_mask_v1tiv1ti pattern.

2022-05-23 Thread Roger Sayle
x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. The new test case is identical to the middle-end patch, so if both patches are approved, this'll be committed only once. Ok for mainline? 2022-05-23 Roger Sayle

[x86 PING] Peephole pand;pxor into pandn

2022-05-23 Thread Roger Sayle
--target_board=unix{-m32}, with no new failures. Ok for mainline? 2022-05-23 Roger Sayle gcc/ChangeLog * config/i386/sse.md (peephole2): Convert suitable pand followed by pxor into pandn, i.e. (X&Y)^X into X & ~Y. Many thanks in advance, Roger -- diff --git a/gcc/config/i386/
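
The Boolean identity behind this peephole, (X&Y)^X == X & ~Y, can be sanity-checked on scalars. The sketch below uses 64-bit integers rather than SSE registers, and the function names are hypothetical.

```c
#include <stdint.h>

/* Two-instruction form: AND followed by XOR with the original X. */
uint64_t and_then_xor(uint64_t x, uint64_t y)
{
    return (x & y) ^ x;
}

/* Single and-not form: keep the bits of X that are clear in Y. */
uint64_t and_not(uint64_t x, uint64_t y)
{
    return x & ~y;
}
```

Both functions should return the same value for every pair of inputs, since each bit of X survives only where the corresponding bit of Y is zero.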

RE: [x86 PING] Peephole pand;pxor into pandn

2022-05-23 Thread Roger Sayle
- > From: Uros Bizjak > Sent: 23 May 2022 09:51 > To: Roger Sayle > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [x86 PING] Peephole pand;pxor into pandn > > On Mon, May 23, 2022 at 10:44 AM Roger Sayle > wrote: > > > > > > This is a ping of a patch

RE: [x86 PING] Peephole pand;pxor into pandn

2022-05-23 Thread Roger Sayle
ed to the backend, to avoid potential regressions related to code size (-Os and -Oz). It's a long road with many steps. Might you reconsider? Pretty please? Roger -- > -Original Message- > From: Uros Bizjak > Sent: 23 May 2022 10:11 > To: Roger Sayle > Cc: gcc-

[PATCH/RFC] PR tree-optimization/96912: Recognize VEC_COND_EXPR in match.pd

2022-05-23 Thread Roger Sayle
exceptions being built-in functions, IFN_* etc.] Should tree.texi document which tree codes can't be used without checking the backend. Bootstrapped and regression tested, but this obviously depends upon RTL expansion being able to perform the inverse operation/lowering if required. 20

RE: [PATCH/RFC] PR tree-optimization/96912: Recognize VEC_COND_EXPR in match.pd

2022-05-23 Thread Roger Sayle
eady thought about for a long time but without coming up with a satisfactory solution]. Thanks again for your assistance/direction. Cheers, Roger -- > -Original Message- > From: Richard Biener > Sent: 23 May 2022 14:36 > To: Roger Sayle > Cc: GCC Patches > Subject: Re:

[x86 PATCH] Optimize double word negation of zero extended values.

2022-05-23 Thread Roger Sayle
tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32}, with no new failures. Ok for mainline? 2022-05-23 Roger Sayle gcc/ChangeLog * config/i386/i386.md (peephole2): Convert xor;neg;adc;neg, i.e. a double word negation

[PATCH] Canonicalize X&-Y as X*Y in match.pd when Y is [0,1].

2022-05-24 Thread Roger Sayle
atch includes three additional optimizations (that account for the change in canonical form) to continue to optimize PR92834 and PR94786. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32}, with no new failures.
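
The equivalence named in the subject, X&-Y versus X*Y when Y is known to be 0 or 1, can be sketched in C as follows. Unsigned arithmetic is used so that the negation is well defined; the function names are hypothetical and this is not the match.pd code.

```c
#include <stdint.h>

/* Mask form: 0u - y is all-zeros when y == 0 and all-ones when y == 1. */
uint32_t select_by_mask(uint32_t x, uint32_t y)
{
    return x & (0u - y);
}

/* Multiplication form: x * 0 is 0 and x * 1 is x. */
uint32_t select_by_mul(uint32_t x, uint32_t y)
{
    return x * y;
}
```

The two forms should agree only under the stated range restriction; for other values of `y` they generally differ, which is why the transformation needs to know that Y is in [0,1].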

[PATCH] Correct implementation of wi::clz

2021-09-05 Thread Roger Sayle
is a multiple of HOST_BITS_PER_WIDE_INT. The fix is simply to reorder/shuffle the existing tests. This patch has been tested on x86_64-pc-linux-gnu with "make bootstrap" and "make -k check" with no new failures. Ok for mainline? 2021-09-05 Roger Sayle gcc/ChangeLog

[PATCH] Simplify paradoxical subreg extensions of TRUNCATE

2021-09-05 Thread Roger Sayle
een tested on x86_64-pc-linux-gnu with "make bootstrap" and "make -k check" with no new failures, and also on nvptx-none with no new failures. Ok for mainline? 2021-09-05 Roger Sayle gcc/ChangeLog * simplify-rtx.c (simplify_subreg): Optimize paradoxical subreg

RE: [PATCH] Simplify paradoxical subreg extensions of TRUNCATE

2021-09-06 Thread Roger Sayle
? Let me know what you think. Best regards, Roger -- -Original Message- From: Segher Boessenkool Sent: 06 September 2021 11:14 To: Roger Sayle Cc: 'GCC Patches' Subject: Re: [PATCH] Simplify paradoxical subreg extensions of TRUNCATE On Sun, Sep 05, 2021 at 11:28:30PM

[PATCH] More NEGATE_EXPR folding in match.pd

2021-09-09 Thread Roger Sayle
s patch has been tested on x86_64-pc-linux-gnu with a "make bootstrap" and "make -k check" with no new failures. Ok for mainline? 2021-09-09 Roger Sayle gcc/ChangeLog * generic-match-head.c (single_use_is_op_p): New helper function. * gimple-match-he

[PATCH Take 2] More NEGATE_EXPR folding in match.pd

2021-09-10 Thread Roger Sayle
;make -k check" with no new failures. Ok for mainline? 2021-09-10 Roger Sayle Richard Biener gcc/ChangeLog * match.pd (negation simplifications): Implement some negation folding transformations from fold-const.c's fold_negate_ex

[PATCH] Also preserve SUBREG_PROMOTED_VAR_P in expr.c's convert_move.

2021-09-11 Thread Roger Sayle
even noticed a problem) minimized the inconvenience]. This patch has been tested on x86_64-pc-linux-gnu with "make bootstrap" and "make -k check" with no new failures, and on a cross-compiler to nvptx-none, with no new failures in its testsuite. OK for mainline? 2021-

[PATCH] PR c/102245: Don't warn that ((_Bool)x<<0) isn't a truthvalue.

2021-09-13 Thread Roger Sayle
een tested on x86_64-pc-linux-gnu with "make bootstrap" and "make -k check" with no new failures. Ok for mainline? 2021-09-13 Roger Sayle gcc/c-family/ChangeLog PR c/102245 * c-common.c (c_common_truthvalue_conversion) [LSHIFT_EXPR]: Specia

[PATCH #2] PR c/102245: Disable sign-changing optimization for shifts by zero.

2021-09-14 Thread Roger Sayle
ap" and "make -k check" with no new failures. Note that test1 in the new testcase is changed from dg-bogus to dg-warning compared with version #1. Ok for mainline? 2021-09-14 Roger Sayle gcc/ChangeLog PR c/102245 * match.pd (shift optimizations): Disable recent s

[PATCH] nvptx: Add (experimental) support for HFmode with -misa=sm_53

2021-09-16 Thread Roger Sayle
description (follow-up patches) in future. I'm happy to defer these changes/hunks until later if reviewers prefer. The following has been tested on nvptx-none, hosted on x86_64-pc-linux-gnu with a "make" and "make -k check" with no new failures. Ok for mainline? 2020-09-16 Rog

[PATCH] nvptx: Adds uses of -misa=sm_75 and -misa=sm_80

2021-09-17 Thread Roger Sayle
ially in future). Are both parts Ok for mainline? 2020-09-17 Roger Sayle gcc/ChangeLog * config/nvptx/nvptx.md (define_c_enum "unspec"): New UNSPEC_TANH. (define_mode_iterator HSFM): New iterator for HFmode and SFmode. (exp2hf2): New define_insn controlle

[PATCH] PR middle-end/88173: More constant folding of NaN comparisons.

2021-09-18 Thread Roger Sayle
to double check/research whether and why those checks may have been removed in the past]. This patch has been tested on x86_64-pc-linux-gnu with "make bootstrap" and "make -k check" with no new failures. Ok for mainline? 2021-09-18 Roger Sayle gcc/ChangeLog P

[RFC/PATCH] C++ constexpr vs. floating point exceptions.

2021-09-21 Thread Roger Sayle
vior. Ideally, what the front-end considers valid should be independent of whether the user specified -fno-trapping-math (or -ffast-math) to the middle-end. Thoughts? Ok for mainline? 2021-09-21 Roger Sayle gcc/cp/ChangeLog * constexpr.c (cxx_eval_outermost_const_expr): Temporarily disable

RE: [PATCH] Simplify paradoxical subreg extensions of TRUNCATE

2021-09-21 Thread Roger Sayle
2 To: Richard Biener via Gcc-patches Cc: Segher Boessenkool ; Richard Biener ; Roger Sayle Subject: Re: [PATCH] Simplify paradoxical subreg extensions of TRUNCATE [Using this as a convenient place to reply to the thread as a whole] Richard Biener via Gcc-patches writes: > On Mon, Sep 6, 2

RE: [RFC/PATCH] C++ constexpr vs. floating point exceptions.

2021-09-21 Thread Roger Sayle
Can you double check? Integer division by zero is undefined, but isn't floating point division by zero defined by the appropriate IEEE standards? Roger -- -Original Message- From: Xi Ruoyao Sent: 21 September 2021 14:07 To: Roger Sayle ; 'GCC Patches' Subject: Re

RE: [RFC/PATCH] C++ constexpr vs. floating point exceptions.

2021-09-21 Thread Roger Sayle
t: 21 September 2021 14:22 To: Roger Sayle ; Jason Merrill ; Jonathan Wakely Cc: 'Xi Ruoyao' ; 'GCC Patches' Subject: Re: [RFC/PATCH] C++ constexpr vs. floating point exceptions. On Tue, Sep 21, 2021 at 02:15:59PM +0100, Roger Sayle wrote: > Can you double check? Integer

[PATCH] Make flag_trapping_math a non-binary Boolean.

2021-09-25 Thread Roger Sayle
x86_64-pc-linux-gnu with "make bootstrap" and "make -k check", all languages including Ada, with no new failures. Ok for mainline? 2021-09-25 Roger Sayle gcc/ChangeLog * flag-types.h (trapping_math_model): New enumeration (of bits) specifying possible floa

[PATCH] Introduce sh_mul and uh_mul RTX codes for high-part multiplications

2021-09-25 Thread Roger Sayle
he @xref warnings in invoke.texi. This patch has been tested on x86_64-pc-linux-gnu with "make bootstrap" and "make -k check" with no new failures. Ok for mainline? 2021-09-25 Roger Sayle gcc/ChangeLog * gcc/rtl.def (SH_MULT, UH_MULT): New RTX codes for representing

[RFC] Experimental __attribute__((saturating)) on integer types.

2021-09-26 Thread Roger Sayle
Thoughts? Even if a new C-family attribute is unsuitable, is my logic/implementation in handle_saturating_attribute correct? 2021-09-26 Roger Sayle gcc/c-family/ChangeLog * c-attribs (handle_saturating_attribute): New callback function for a "saturating" attribute

RE: [PATCH] Make flag_trapping_math a non-binary Boolean.

2021-09-28 Thread Roger Sayle
me know if you strongly feel all FP traps must be treated the same by the middle-end. Indeed, if flag_trapping_math is restricted to only be FLAG_TRAPPING_DEFAULT in a front-end(s), they will be. Best regards, Roger -- -Original Message- From: Joseph Myers Sent: 27 September 2021 21:05

[PATCH #2] Introduce smul_highpart and umul_highpart RTX for high-part multiplications

2021-09-29 Thread Roger Sayle
in the four new attached test cases. This patch has been tested on x86_64-pc-linux-gnu with "make bootstrap" and "make -k check" with no new failures. Ok for mainline? 2021-09-29 Roger Sayle Richard Sandiford gcc/ChangeLog * gcc/rtl.def (SMU

[PATCH] x86_64: Expand ashrv1ti (and PR target/102986)

2021-10-30 Thread Roger Sayle
two pieces, but the functionality overlaps and this patch was nearly ready to submit to gcc-patches when 102986 appeared in bugzilla. 2021-10-30 Roger Sayle gcc/ChangeLog PR target/102986 * config/i386/i386-expand.c (ix86_expand_v1ti_to_ti, ix86_expand_ti_to_v1ti):

[PATCH Take #2] x86_64: Expand ashrv1ti (and PR target/102986)

2021-10-31 Thread Roger Sayle
Roger Sayle Jakub Jelinek gcc/ChangeLog PR target/102986 * config/i386/i386-expand.c (ix86_expand_v1ti_to_ti, ix86_expand_ti_to_v1ti): New helper functions. (ix86_expand_v1ti_shift): Check if the amount operand is an integer constant, and expand

[PATCH] x86_64: Improved implementation of TImode rotations.

2021-11-01 Thread Roger Sayle
bootstrap and make -k check with no new failures. Interestingly the correct behaviour is already tested by (amongst other tests) sse2-v1ti-shift-3.c that confirms V1TImode rotates by constants match rotlti3/rotrti3. Ok for mainline? 2021-11-01 Roger Sayle * config/i386/i386.md (ti3):

Some PINGs

2021-11-06 Thread Roger Sayle
I wonder if reviewers could take a look (or a second look) at some of my outstanding patches. Four nvptx backend patches: nvptx: Use cvt to perform sign-extension of truncation. https://gcc.gnu.org/pipermail/gcc-patches/2021-August/578256.html nvptx: Add (experimental) support for HFmode with

RE: Some PINGs

2021-11-07 Thread Roger Sayle
>On 11/6/2021 4:20 PM, Roger Sayle wrote: >> Simplify paradoxical subreg extensions of TRUNCATE >> https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578848.html > So the discussion seemed to end with a recommendation to try and address this > earlier in the call

RE: Some PINGs

2021-11-08 Thread Roger Sayle
Hi Richard, >> I wonder if reviewers could take a look (or a second look) at some of >> my outstanding patches. >> PR middle-end/100810: Penalize IV candidates with undefined value >> bases >> https://gcc.gnu.org/pipermail/gcc-patches/2021-August/578441.html > > I did comment on this one, not

[PATCH] ivopts: Improve code generated for very simple loops.

2021-11-15 Thread Roger Sayle
), %rdx movq %rdx, (%rsi,%rax) addq $8, %rax cmpq %rax, %rcx jne .L3 .L1: ret This patch has been tested on x86_64-pc-linux-gnu with a make bootstrap and make -k check with no new failures. Ok for mainline? 2021-11-15 Roger Sayle gcc/Ch

[PATCH] x86_64: Avoid rorx rotation instructions with -Os

2021-11-15 Thread Roger Sayle
pc-linux-gnu with make bootstrap and make -k check with no new failures. Ok for mainline? 2021-11-15 Roger Sayle gcc/ChangeLog * config/i386/i386.md (*bmi2_rorx_1): Make conditional on !optimize_function_for_size_p. (*3_1): Add preferred_for_

[PATCH] PR middle-end/53267: Constant fold BUILT_IN_FMOD.

2021-06-08 Thread Roger Sayle
Here's a three line patch to implement constant folding for fmod, fmodf and fmodl, which resolves an enhancement request from 2012. The following patch has been tested on x86_64-pc-linux-gnu with a make bootstrap and make -k check with no new failures. Ok for mainline? 2020-06-08 Roger

RE: [PATCH] PR middle-end/53267: Constant fold BUILT_IN_FMOD.

2021-06-09 Thread Roger Sayle
ecade ago. Having someone help with committing patches is always very much appreciated. Cheers, Roger -- -Original Message- From: Jeff Law Sent: 09 June 2021 16:27 To: Richard Biener ; Roger Sayle Cc: GCC Patches Subject: Re: [PATCH] PR middle-end/53267: Constant fold BUILT_IN_FMOD. On

[PATCH] PR tree-optimization/96392 Optimize x+0.0 if x is an integer

2021-06-10 Thread Roger Sayle
itional folding transformations from "testing the type" to "testing the value". The following patch has been tested on x86_64-pc-linux-gnu with a make bootstrap and make -k check with no new failures. Ok for mainline? 2020-06-10 Roger Sayle gcc/ChangeLog PR tree-optimi

[PATCH] Try placing RTL folded constants in constant pool

2021-10-03 Thread Roger Sayle
ark this for me that would be much appreciated. This patch has been tested on x86_64-pc-linux-gnu with "make bootstrap" and "make -k check" with no new failures. Ok for mainline? 2021-10-03 Roger Sayle gcc/ChangeLog * combine.c (recog_for_combine)

[PATCH] Transition nvptx backend to STORE_FLAG_VALUE = 1

2021-10-05 Thread Roger Sayle
the RTL passes to eventually generate a much shorter sequence using an and.pred instruction (just like Nvidia's nvcc compiler). This patch has been tested on nvptx-none with a "make" and "make -k check" (including newlib) hosted on x86_64-pc-linux-gnu with no new failures. Ok for

[Committed] Tweak new test cases for -march=cascadelake strangeness.

2021-10-08 Thread Roger Sayle
g cause of these differences. Tested on x86_64-pc-linux-gnu (with and without -march=cascadelake). 2021-10-08 Roger Sayle gcc/testsuite/ChangeLog * gcc.target/i386/sse2-mmx-paddsb-2.c: Test for -128 or 128. * gcc.target/i386/sse2-mmx-paddusb-2.c: Test for -1 or 255. *

[PATCH] x86_64: Some SUBREG related optimization tweaks to i386 backend.

2021-10-11 Thread Roger Sayle
ested on x86_64-pc-linux-gnu with "make bootstrap" and "make -k check" with no new failures. In theory, my recent "obvious" regexp fix to accommodate -march=cascadelake is no longer required, but there's no harm leaving the testsuite as it is. Ok for ma

[PATCH v2] x86_64: Some SUBREG related optimization tweaks to i386 backend.

2021-10-13 Thread Roger Sayle
. I've left the existing behaviour the same, so that memory-to-memory moves (continue to) use ix86_gen_scratch_sse_rtx. This patch has been tested on x86_64-pc-linux-gnu with "make bootstrap" and "make -k check" with no new failures. Ok for mainline? 2021-10-13 Rog

[PATCH] Allow early sets of SSE hard registers from standard_sse_constant_p

2021-10-15 Thread Roger Sayle
tested with "make bootstrap" and "make -k check" on x86_64-pc-linux-gnu with no new failures. Ok for mainline? Sorry again for the temporary inconvenience. 2021-10-15 Roger Sayle gcc/ChangeLog * config/i386/i386.c (ix86_hardreg_mov_ok): For vector modes, a

[PATCH] Constant fold SS_NEG and SS_ABS in simplify-rtx.c

2021-10-17 Thread Roger Sayle
pile to: _foo: nop; nop; nop; R0 = 32767 (X); rts; _bar: nop; nop; R0 = -1 (X); R0.H = 32767; rts; This patch has been tested on x86_64-pc-linux-gnu with "make bootstrap" and "make -k check" with no new failures. O

[PATCH] bfin: Popcount-related improvements to machine description.

2021-10-17 Thread Roger Sayle
R1 = 1 (X); R0.L = ONES R0; R0 = R1 & R0; rts; This patch has been tested on a cross-compiler to bfin-elf hosted on x86_64-pc-linux-gnu, but without a toolchain, and shows no regressions in the compile-only parts of the testsuite. Ok for mainline? 2021-10-17 Roger Sayle

[PATCH] PR target/102785: Correct addsub/subadd patterns on bfin.

2021-10-18 Thread Roger Sayle
arted evaluating these expressions at compile-time, when the mismatch was caught by the testsuite. Many thanks to Jeff Law for confirming that this patch fixes these regressions on bfin-elf. Ok for mainline? 2021-10-18 Roger Sayle gcc/ChangeLog PR target/102785 * config/bfin/bf

[PATCH] x86_64: Add insn patterns for V1TI mode logic operations.

2021-10-22 Thread Roger Sayle
ke -k check" with no new failures. Ok for mainline? 2021-10-22 Roger Sayle gcc/ChangeLog * config/i386/sse.md (v1ti3): New define_insn to implement V1TImode AND, IOR and XOR on TARGET_SSE2 (and above). (one_cmplv1ti2): New define expand. gcc/testsuite/ChangeLog

[Committed] Correct testcase gcc.target/bfin/20090914-3.c

2021-10-24 Thread Roger Sayle
by turning the code into a function returning the final "fract32" result, as simply specifying an "int" return type for main, results in the entire function being optimized away, as the result is unused. Checked-in as obvious. 2021-10-24 Roger Sayle gcc/testsuite/ChangeLo

[PATCH] x86_64: Implement V1TI mode shifts/rotates by a constant

2021-10-24 Thread Roger Sayle
e -k check with no new failures. Ok for mainline? 2021-10-24 Roger Sayle gcc/ChangeLog * config/i386/i386-expand.c (ix86_expand_v1ti_shift): New helper function to expand V1TI mode logical shifts by integer constants. (ix86_expand_v1ti_rotate): New helper function to e

[PATCH] Constant fold/simplify SS_ASHIFT and US_ASHIFT in simplify-rtx.c

2021-10-25 Thread Roger Sayle
(X); rts; _stest_sat_min: nop; nop; nop; R0 = -32768 (X); rts; This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check with no new failures, and on a cross-compiler to bfin-elf with no regressions. Ok for mainline?

RE: [PATCH] x86_64: Implement V1TI mode shifts/rotates by a constant

2021-10-25 Thread Roger Sayle
ode size win with -Os (ashl_8 is currently 39 bytes, shrinks to 5 bytes with this patch). Please let me know what you think. Roger -- -Original Message- From: Uros Bizjak Sent: 25 October 2021 09:02 To: Roger Sayle Cc: GCC Patches Subject: Re: [PATCH] x86_64: Implement V1TI mode shifts/r

[PATCH] PR tree-opt/40210: Fold (bswap(X)>>C1)&C2 to (X>>C3)&C2 in match.pd

2021-07-06 Thread Roger Sayle
%rax shrq $56, %rax ret with this patch, it now compiles to foo: movzbl %dil, %eax ret This patch has been tested on x86_64-pc-linux-gnu with a "make bootstrap" and "make -k check" with no new failures. Ok for mainline? 2021-07

RE: [PATCH] PR tree-opt/40210: Fold (bswap(X)>>C1)&C2 to (X>>C3)&C2 in match.pd

2021-07-08 Thread Roger Sayle
n the attached revised patch, which has been tested on x86_64-pc-linux-gnu with a "make bootstrap" and "make -k check" with no new failures. Ok for mainline? 2021-07-08 Roger Sayle Richard Biener gcc/ChangeLog PR tree-optimization/40210 * matc

[x86_64 PATCH]: Improvement to signed division of integer constant.

2021-07-08 Thread Roger Sayle
are larger than cltd, so this transformation is prevented for -Os. This patch has been tested on x86_64-pc-linux-gnu with a "make bootstrap" and "make -k check" with no new failures. Ok for mainline? 2021-07-08 Roger Sayle gcc/ChangeLog * config/i386/i386.md (*div

[PATCH] PR tree-optimization/38943: Preserve trapping instructions with -fnon-call-exceptions

2021-07-08 Thread Roger Sayle
or mainline? 2021-07-08 Roger Sayle gcc/ChangeLog PR tree-optimization/38943 * gimple.c (gimple_has_side_effects): Consider trapping to be a side-effect when -fnon-call-exceptions is specified. (gimple_could_trap_p_1): Make S argument a "const gimple*".

[PATCH take 2] PR tree-optimization/38943: Preserve trapping instructions with -fpreserve-traps

2021-07-10 Thread Roger Sayle
been tested on x86_64-pc-linux-gnu with a "make bootstrap" and "make -k check" with no new failures. Ok for mainline? 2021-07-09 Roger Sayle Eric Botcazou Richard Biener gcc/ChangeLog PR tree-optimization/38943 PR middle-end/39

[PATCH] PR tree-optimization/101403: Incorrect folding of ((T)bswap(x))>>C

2021-07-11 Thread Roger Sayle
lause to handle the hypothetical (but in practice impossible) sign-extension to an unsigned type T, which can be implemented as (T)(x<<8)>>12. This patch has been tested on x86_64-pc-linux-gnu with a "make bootstrap" and "make -k check" with no new failures, and a new testca

[Committed] Make gimple_could_trap_p const-safe.

2021-07-13 Thread Roger Sayle
ake a const gimple (such as gimple_has_side_effects), and update its prototypes. These chunks have been (re)tested on x86_64-pc-linux-gnu with a "make bootstrap" and "make -k check" with no new failures. Committed to mainline as obvious/pre-approved. 2021-07-13 Roger Sayle

[PATCH] Fold bswap32(x) != 0 to x != 0 (and related transforms)

2021-07-18 Thread Roger Sayle
simplifies to X eq/ne -1, if Y has no side-effects. This patch has been tested on x86_64-pc-linux-gnu with a "make bootstrap" and "make -k check" with no new failures. Ok for mainline? 2010-07-18 Roger Sayle gcc/ChangeLog * match.pd (rotate): Simplify eq

[PATCH take 2] Fold bswap32(x) != 0 to x != 0 (and related transforms)

2021-07-24 Thread Roger Sayle
make bootstrap and make -k check with no new failures. Ok for mainline? 2010-07-24 Roger Sayle Marc Glisse gcc/ChangeLog * match.pd (rotate): Simplify equality/inequality of rotations. (bswap): Simplify equality/inequality tests of byte swapping. gcc/testsuite/Chan
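
The reasoning behind folding bswap32(x) != 0 to x != 0 is that a byte swap only permutes bytes, so the result is zero exactly when the input is zero. A minimal sketch, using a hand-written swap rather than the compiler builtin (both function names are hypothetical):

```c
#include <stdint.h>

/* Portable 32-bit byte swap: reverses the order of the four bytes. */
uint32_t bswap32_portable(uint32_t x)
{
    return (x >> 24)
         | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u)
         | (x << 24);
}

/* Tests the swapped value against zero; equivalent to testing x itself. */
int nonzero_after_swap(uint32_t x)
{
    return bswap32_portable(x) != 0;
}
```

Because no bits are created or destroyed by the permutation, `nonzero_after_swap(x)` should always match `x != 0`, and the swap can be removed from such a comparison.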

[x86_64 PATCH] Decrement followed by cmov improvements.

2021-07-26 Thread Roger Sayle
rovement in the MonteCarlo benchmark kernel. This patch has been tested on x86_64-pc-linux-gnu with a "make bootstrap" and "make -k check" with no new failures. Ok for mainline? 2021-07-26 Roger Sayle gcc/ChangeLog * config/i386/i386.md (*dec_cmov): New define_i

[PATCH] Fold (X<

2021-07-26 Thread Roger Sayle
angeLog * match.pd (bit_ior, bit_xor): Canonicalize (X*C1)|(X*C2) and (X*C1)^(X*C2) as X*(C1+C2), and related variants, using tree_nonzero_bits to ensure that operands are bit-wise disjoint. gcc/testsuite/ChangeLog * gcc.dg/fold-ior-4.c: New test. Roger -- Roger Sayle N

[PATCH take 2] Fold (X<

2021-07-28 Thread Roger Sayle
ainline? 2021-07-28 Roger Sayle Marc Glisse gcc/ChangeLog * match.pd (bit_ior, bit_xor): Canonicalize (X*C1)|(X*C2) and (X*C1)^(X*C2) as X*(C1+C2), and related variants, using tree_nonzero_bits to ensure that operands are bit-wise disjoint. gcc/testsuite/Change
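
The canonicalization of (X*C1)|(X*C2) as X*(C1+C2) relies on the two products having no set bits in common, so that OR behaves like addition. A small sketch with the assumed constants C1 = 1 and C2 = 256 (function names are hypothetical):

```c
#include <stdint.h>

/* OR form: the two products occupy disjoint bits only if x fits in 8 bits. */
uint32_t ior_form(uint32_t x)
{
    return (x * 1u) | (x * 256u);
}

/* Multiplication form: x * (1 + 256). */
uint32_t mul_form(uint32_t x)
{
    return x * 257u;
}
```

For `x` below 256 the two functions should agree; once `x` has bits at position 8 or above, the products overlap and the results can differ, which is why the ChangeLog mentions using tree_nonzero_bits to establish disjointness first.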

[PATCH] Optimize x ? bswap(x) : 0 in tree-ssa-phiopt

2021-07-31 Thread Roger Sayle
check" with no new failures. Ok for mainline? 2021-07-31 Roger Sayle gcc/ChangeLog * tree-ssa-phiopt.c (cond_removal_in_builtin_zero_pattern): Renamed from cond_removal_in_popcount_clz_ctz_pattern. Add support for BSWAP, FFS, PARITY and CLRSB builtins. (tre

RE: [r12-2640 Regression] FAIL: gcc.target/i386/dec-cmov-2.c scan-assembler-not test(l|q|w) on Linux/x86_64

2021-07-31 Thread Roger Sayle
these failures. Committed as obvious. 2021-07-31 Roger Sayle gcc/testsuite/ChangeLog * gcc.target/i386/dec-cmov-2.c: Require -march=core2 with -m32. Roger -- -Original Message- From: sunil.k.pandey Sent: 31 July 2021 08:13 To: gcc-patches@gcc.gnu.org; gcc-regress

[PATCH] PR target/114187: Fix ?Fmode SUBREG simplification in simplify_subreg.

2024-03-03 Thread Roger Sayle
added/modified potentially contributed to this lapse. Using lowpart_subreg should avoid/reduce confusion in future. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for ma

[x86_64 PATCH] PR target/113690: Fix-up MULT REG_EQUAL notes in STV.

2024-02-04 Thread Roger Sayle
64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-02-05 Roger Sayle gcc/ChangeLog PR target/113690 * config/i386/i386-features.cc (timode_convert_cst): New helper functi

RE: [libatomic PATCH] Fix testsuite regressions on ARM [raspberry pi].

2024-01-11 Thread Roger Sayle
ibatomic doesn't have a (multi-threaded) run-time test to search for race conditions, and confirm its implementations are correctly serializing. Please let me know what you think. Best regards, Roger -- > -Original Message- > From: Richard Earnshaw > Sent: 10 January 2024 15:34

[PATCH/RFC] Add --with-dwarf4 configure option.

2024-01-14 Thread Roger Sayle
. do the right thing. In fact, I'd originally misread the documentation and assumed --with-dwarf4 was already supported. 2024-01-14 Roger Sayle gcc/ChangeLog * configure.ac: Add a with --with dwarf4 option. * configure: Regenerate. * confi

[PATCH] PR rtl-optimization/111267: Improved forward propagation.

2024-01-15 Thread Roger Sayle
%xmm2, %xmm1 setnb %al ret .L6: xorl %eax, %eax ret This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Additionally, it also resolves the FAIL for gcc.target/

[x86 PATCH] PR target/106060: Improved SSE vector constant materialization.

2024-01-16 Thread Roger Sayle
gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-01-16 Roger Sayle gcc/ChangeLog PR target/106060 * config/i386/i386-expand.cc (enum ix86_vec_bcast_alg): New. (struct ix86_vec_bcas

[middle-end PATCH] Prefer PLUS over IOR in RTL expansion of multi-word shifts/rotates.

2024-01-18 Thread Roger Sayle
add2 r1,r2,r1 j_s.d [blink] add2 r0,r3,r0 This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-01-18 Roger Sayle gcc/ChangeLog

RE: [middle-end PATCH] Prefer PLUS over IOR in RTL expansion of multi-word shifts/rotates.

2024-01-19 Thread Roger Sayle
-level might lead to a code quality regression, if RTL expansion doesn't know to lower it back to use PLUS on those targets with lea but without rotate. > From: Richard Biener > Sent: 19 January 2024 11:04 > On Thu, Jan 18, 2024 at 8:55 PM Roger Sayle > wrote: > > > > T

RE: [x86 PATCH] PR target/106060: Improved SSE vector constant materialization.

2024-01-25 Thread Roger Sayle
no new failures. Ok for mainline (in stage 1)? 2024-01-25 Roger Sayle Hongtao Liu gcc/ChangeLog PR target/106060 * config/i386/i386-expand.cc (enum ix86_vec_bcast_alg): New. (struct ix86_vec_bcast_map_simode_t): New type for table below. (ix86_vec

[middle-end PATCH] Constant fold {-1,-1} << 1 in simplify-rtx.cc

2024-01-26 Thread Roger Sayle
n now checks that VEC_SELECT or some funky (future) rtx_code doesn't cause problems. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline (in stage 1)? 2024-01-26 Roger Sa

[libatomic PATCH] PR other/113336: Fix libatomic testsuite regressions on ARM.

2024-01-28 Thread Roger Sayle
This patch is a revised version of the fix for PR other/113336. This patch has been tested on arm-linux-gnueabihf with --with-arch=armv6 with make bootstrap and make -k check where it fixes all of the FAILs in libatomic. Ok for mainline? 2024-01-28 Roger Sayle Victor Do

[tree-ssa PATCH] PR target/113560: Enhance is_widening_mult_rhs_p.

2024-01-29 Thread Roger Sayle
ootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2023-01-30 Roger Sayle gcc/ChangeLog PR target/113560 * tree-ssa-math-opts.cc (is_widening_mult_rhs_p): Use range information via tree_non_zero_bits to check i
