[x86_64 PATCH] PR target/105791: Add V1TI to V_128_256 for xop_pcmov_v1ti.

2022-06-01 Thread Roger Sayle
h make bootstrap and make -k check, both with and without --target_board=unix{-m32}, with no new failures. Ok for mainline? 2022-06-02 Roger Sayle gcc/ChangeLog PR target/105791 * config/i386/sse.md (V_128_256): Add V1TI and V2TI. (define_mode_attr avxsizesuffix): Add

[x86 PATCH] Add peephole2 to reduce double word register shuffling.

2022-06-02 Thread Roger Sayle
ike to add the new testcase with part 2, once we're back down to requiring only two movq instructions. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2022-06-02 Rog

[PATCH/RFC] cprop_hardreg... Third time's a charm.

2022-06-02 Thread Roger Sayle
ootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Thoughts? Ok for mainline? 2022-06-02 Roger Sayle gcc/ChangeLog * regcprop.cc (pass_cprop_hardreg::execute): Perform a third iteration over each basic block that was updated

RE: [PATCH] Fold truncations of left shifts in match.pd

2022-06-02 Thread Roger Sayle
Hi Richard, > + /* RTL expansion knows how to expand rotates using shift/or. */ if > + (icode == CODE_FOR_nothing > + && (code == LROTATE_EXPR || code == RROTATE_EXPR) > + && optab_handler (ior_optab, vec_mode) != CODE_FOR_nothing > + && optab_handler (ashl_optab, vec_mode) != C

[x86 PATCH] PR target/91681: zero_extendditi2 pattern for more optimizations.

2022-06-03 Thread Roger Sayle
he unpack mask operations, which matches what sse.md does for the other mask specific (logic) operations. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32}, with no new failures. Ok for mainline? 2022-06-

[PATCH/RFC take #2] cprop_hardreg... Third time's a charm.

2022-06-03 Thread Roger Sayle
this what you had in mind? 2022-06-03 Roger Sayle Richard Biener gcc/ChangeLog * regcprop.cc (pass_cprop_hardreg::execute): Perform a third iteration over each basic block that was updated by the second iteration. Cheers, Roger -- > -Origina

[x86 PATCH] Recognize vpcmov in combine with -mxop.

2022-06-04 Thread Roger Sayle
nd while there I also added rtx_costs for x86_64's integer conditional move instructions (which have single cycle latency). This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32}, with no new failures. Ok for

[C++ PATCH take #2] PR c++/96442: Improved error recovery in enumerations.

2022-06-05 Thread Roger Sayle
subdirectory as per your feedback on my previous ICE-on-invalid fixes. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check with no new (unexpected) failures. Ok for mainline? 2022-06-05 Roger Sayle gcc/cp/ChangeLog PR c++/96442 * d

[PATCH take #2] Fold truncations of left shifts in match.pd

2022-06-05 Thread Roger Sayle
ted on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32}, with no new failures. Ok for mainline? 2022-06-05 Roger Sayle Richard Biener gcc/ChangeLog * match.pd (convert (lshift @1 INTEGER_CST@2)): Narrow integer

RE: [x86 PATCH] PR target/91681: zero_extendditi2 pattern for more optimizations.

2022-06-05 Thread Roger Sayle
now this patch keeps double word patterns consistent]. This revised patch has been retested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2022-06-05 Roger Sayle Uroš Bizjak

[PATCH] PR tree-optimization/105835: Two narrowing patterns for match.pd.

2022-06-05 Thread Roger Sayle
tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2022-06-05 Roger Sayle gcc/ChangeLog * match.pd (convert (mult zero_one_valued_p@1 INTEGER_CST@2)): Narrow integer mul

[x86 PATCH] Double word implementation of and; cmp to not; test optimization.

2022-06-05 Thread Roger Sayle
get/i386/pr65105-5.c now fails. Counter-intuitively, this is progress, and pr65105-5.c may now be fixed (without using peephole2) simply by tweaking the STV pass to handle andn/test (in a follow-up patch). OK for mainline? 2022-06-05 Roger Sayle gcc/ChangeLog * config/i386/i386.cc (ix86_rt

RE: [x86 PATCH] Double word implementation of and; cmp to not; test optimization.

2022-06-06 Thread Roger Sayle
Hi Uros, > > The major theme of this patch is to generalize many of i386.md's > > *di3_doubleword patterns to become *_doubleword patterns, i.e. > > whenever there exists a "double word" optimization for DImode with > > -m32, there should be an equivalent TImode optimization on TARGET_64BIT. > >

RE: [PING] PR middle-end/95126: Expand small const structs as immediate constants

2022-06-06 Thread Roger Sayle
akage. Hopefully the fix I'm testing will cure this as well (but an ICE is different symptom to a silent miscompilation). Sorry again, Roger -- > -Original Message- > From: Rainer Orth > Sent: 05 June 2022 21:31 > To: Andreas Schwab > Cc: Roger Sayle ; gcc-patches@gc

RE: [PING] PR middle-end/95126: Expand small const structs as immediate constants

2022-06-06 Thread Roger Sayle
Hi Andreas, > > gcc -std=gnu99 -c -g -gnatpg -gnatwns -gnata -W -Wall -I- -I. > > -Iada/generated -Iada -I../../gcc/gcc/ada ../../gcc/gcc/ada/osint.adb > > -o ada/osint.o > > osint.adb:438:31: "strlen" not declared in "CRTL" > > osint.adb:441:14: "strncpy" not declared in "CRTL" > > osint.adb:6

RE: [PING] PR middle-end/95126: Expand small const structs as immediate constants

2022-06-06 Thread Roger Sayle
Hi Rainer, > > The one experiment I'd like to be able to try, to investigate the > > cause/cure of this, is: > > > > diff --git a/gcc/calls.cc b/gcc/calls.cc index a4336c1..05fdd24 100644 > > --- a/gcc/calls.cc > > +++ b/gcc/calls.cc > > @@ -2177,7 +2177,7 @@ load_register_parameters (struct arg

[PATCH] PR middle-end/105853: Call store_constructor directly from calls.cc.

2022-06-06 Thread Roger Sayle
oard=unix{-m32}, OK for mainline if that also passes? My sincere apologies for the inconvenience. 2022-06-06 Roger Sayle gcc/ChangeLog PR middle-end/105853 PR target/105856 * calls.cc (load_register_parameters): Call store_constructor (and int_expr_size)

[Committed] Add -mno-avx2 to recent gcc.target/i386/xop-vpcmov3.c

2022-06-08 Thread Roger Sayle
vl to the command line options. Committed to mainline as obvious (in hindsight). 2022-06-08 Roger Sayle gcc/testsuite/ChangeLog * gcc.target/i386/xop-pcmov3.c: Add -mno-avx512vl to dg-options. Roger -- > -Original Message- > From: skpan...@sc.intel.com > Sent: 07 J

[PATCH] PR middle-end/105874: Use EXPAND_MEMORY to fix ada bootstrap.

2022-06-08 Thread Roger Sayle
tested on x86_64-pc-linux-gnu with make bootstrap and make -k check (with no new failures), but also with --enable-languages="ada" where it allows the bootstrap to finish, and with no unexpected failures in the acats and gnat testsuites. Ok for mainline? 2022-06-08 Roger Sayle gc

[rs6000 PATCH] PR target/105991: Recognize PLUS and XOR forms of rldimi.

2022-06-16 Thread Roger Sayle
k for bootstrapping and regression testing this change without problems. Hopefully the new testcase is portable across powerpc's effective-targets. Ok for mainline? 2022-06-17 Roger Sayle Marek Polacek gcc/ChangeLog PR target/105991 * config/rs6000/rs6000.md (plus_xor)

[x86 PATCH] PR target/105930: Split *xordi3_doubleword after reload.

2022-06-22 Thread Roger Sayle
rking evaluate it, then revert the patch if there are any observed performance issues. Thoughts? 2022-06-22 Roger Sayle gcc/ChangeLog PR target/105930 * config/i386/i386.md (*di3_doubleword): Split after reload. Use rtx_equal_p to avoid creating memory-to-

[PATCH] PR tree-optimization/94026: Simplify (X>>8)&6 != 0 as X&1536 != 0.

2022-06-24 Thread Roger Sayle
tch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32}, with no new failures, OK for mainline? 2022-06-24 Roger Sayle gcc/ChangeLog PR tree-optimization/94026 * match.pd (((X << C1) & C2) eq/
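
The identity in the subject line can be checked by hand: masking with 6 after shifting right by 8 inspects bits 9 and 10 of X, and 6 << 8 = 1536 selects the same two bits without the shift. The following is a minimal sketch of that arithmetic in C, not code from the patch; the function names are hypothetical and X is assumed to be an unsigned 32-bit value.

```c
#include <stdint.h>

/* Original form: shift right by 8, mask with 6, compare against zero. */
static int shifted_test(uint32_t x) { return ((x >> 8) & 6) != 0; }

/* Simplified form: mask with 6 << 8 == 1536, so no shift is needed. */
static int masked_test(uint32_t x) { return (x & 1536u) != 0; }

/* Returns 1 if both forms agree on every 16-bit value and on a few
   values with higher bits set, 0 otherwise. */
static int forms_agree(void) {
  static const uint32_t extra[] = { 0x80000000u, 0xFFFFFFFFu, 0xFFFFF9FFu };
  for (uint32_t x = 0; x <= 0xFFFFu; x++)
    if (shifted_test(x) != masked_test(x)) return 0;
  for (unsigned i = 0; i < sizeof extra / sizeof extra[0]; i++)
    if (shifted_test(extra[i]) != masked_test(extra[i])) return 0;
  return 1;
}
```

Calling forms_agree() exercises both forms over the low 16 bits, which covers every combination of the two bits that matter.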

[x86_64 PATCH] Implement __imag__ of float _Complex using shufps.

2022-06-26 Thread Roger Sayle
register allocator prefers to use SSE, we split to a shufps_v4si, or if not, we use a regular shrq. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, with no new failures. Ok for mainline? 2022-06-26 Roger Sayle gcc/ChangeLog PR rtl

[x86 PATCH] PR rtl-optimization/96692: ((A|B)^C)^A using andn with -mbmi.

2022-06-26 Thread Roger Sayle
rtx_costs for "(and (not ..." (as there's no optab for andn). This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32}, with no new failures. Ok for mainline? 2022-06-26 Roger Sayle gcc/ChangeLog

[PATCH take 2] middle-end: Support ABIs that pass FP values as wider integers.

2022-06-26 Thread Roger Sayle
n their targets. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check with no new failures, and on nvptx-none, where it is the middle-end portion of a pair of patches to allow the default ISA to be advanced. Ok for mainline? 2022-06-26 Roger Sayle gcc/Chang

[x86 PATCH] Use xchg for DImode double word rotate by 32 bits with -m32.

2022-06-26 Thread Roger Sayle
rom 5626 to 5404. Although there's an impressive reduction in instruction count, there's no change/reduction in stack frame size. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32}, with no new failures. Ok

[rs6000 PATCH] Improve constant integer multiply using rldimi.

2022-06-26 Thread Roger Sayle
reasonable? [I've another patch of x86 that uses the same idiom]. This patch has been tested on powerpc64le-unknown-linux-gnu with make bootstrap and make -k check with no new failures. Ok for mainline? 2022-06-26 Roger Sayle gcc/ChangeLog * config/rs6000/rs6000.md (*r

[x86 PATCH] Double word logical operation clean-ups in i386.md.

2022-06-28 Thread Roger Sayle
ew failures. Ok for mainline? 2022-06-28 Roger Sayle gcc/ChangeLog * config/i386/i386.md (general_szext_operand): Add TImode support using x86_64_hilo_general_operand predicate. (*cmp_doubleword): Use x86_64_hilo_general_operand predicate. (*add3_doubleword): I

[x86 PATCH take #2] Double word logical operation clean-ups in i386.md.

2022-06-30 Thread Roger Sayle
perand wherever we use the "r" constraint, and that's used consistently in this patch. I hope these exceptions are acceptable. The attached revised patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check both with and without --target_board=unix{-m32} with no n

[PATCH] PR rtl-optimization/46235: Improved use of bt for bit tests on x86_64.

2021-06-15 Thread Roger Sayle
by setnc with zero extension. gcc/testsuite/ChangeLog PR rtl-optimization/46235 * gcc.target/i386/bt-5.c: New test. * gcc.target/i386/bt-6.c: New test. * gcc.target/i386/bt-7.c: New test. Roger -- Roger Sayle NextMove Software Cambridge, UK diff --git a/gcc/config

[x86_64 PATCH] PR target/11877: Use xor to write zero to memory with -Os

2021-06-20 Thread Roger Sayle
and make -k check with no new failures. Ok for mainline? 2021-06-20 Roger Sayle gcc/ChangeLog PR target/11877 * config/i386/i386.md: New define_peephole2s to shrink writing 1, 2 or 4 consecutive zeros to memory when optimizing for size. gcc/testsuite/ChangeLo

[PATCH] PR target/103611: Avoid generating orb $0, %ah on x86.

2021-12-13 Thread Roger Sayle
nt splitter, either eliminating the instruction or turning it into a simple move. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without "--target_board='unix{-m32}'" with no new failures. OK for mainline? 2021-12-13 Rog

[PATCH] x86: PR target/103611: Splitter for DST:DI = (HI:SI<<32)|LO:SI.

2021-12-13 Thread Roger Sayle
seems reasonable (but this patch has been tested both with and without this last change, if it's considered controversial). This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without "--target_board='unix{-m32}'" with no ne

[PATCH take #2] x86_64: Improve code expanded for highpart multiplications.

2021-12-20 Thread Roger Sayle
(with and without RUNTESTFLAGS="--target_board='unix{-m32}'") with no new failures. Ok for mainline? 2021-12-20 Roger Sayle Uroš Bizjak gcc/ChangeLog * config/i386/i386.md (any_mul_highpart): New code iterator. (sgnprefix, s): Add attribut

[PATCH] PR target/103773: Fix wrong-code with -Oz from pop to memory.

2021-12-21 Thread Roger Sayle
make -k check with no new failures. Ok for mainline? 2021-12-21 Roger Sayle gcc/ChangeLog PR target/103773 * config/i386/i386.md (*movdi_internal): Only use short push/pop sequence for register (non-memory) destinations. (*movsi_internal): Likewise. gcc/testsuite

[PATCH] x86: Shrink writing 0/-1 to memory using and/or with -Oz.

2021-12-21 Thread Roger Sayle
res, and the new testcase checked both with and without -m32. Ok for mainline? 2021-12-21 Roger Sayle gcc/ChangeLog * gcc/config/i386/i386.md (define_peephole2): With -Oz use andl $0,mem instead of movl $0,mem and orl $-1,mem instead of movl $-1,mem.

RE: [PATCH] PR target/103773: Fix wrong-code with -Oz from pop to memory.

2021-12-22 Thread Roger Sayle
as this testing included the 0/-1 write to memory changes). Tested (overnight) on x86_64-pc-linux-gnu with make bootstrap and make -k check with no new failures. 2021-12-22 Roger Sayle gcc/ChangeLog PR target/103773 * config/i386/i386.md (*movdi_internal): Only use short

[PATCH take #3] PR target/103773: Fix wrong-code with -Oz from pop to memory.

2021-12-23 Thread Roger Sayle
c-linux-gnu with make bootstrap and make -k check with no new failures, and the new testcase checked both with and without -m32. Ok for mainline? 2021-12-23 Roger Sayle Uroš Bizjak gcc/ChangeLog PR target/103773 * config/i386/i386.md (*mov_and): New define_insn f

[PATCH] nvptx: Add support for PTX's cnot instruction.

2022-01-06 Thread Roger Sayle
64-pc-linux-gnu (including newlib) with a make and make -k check with no new failures. Ok for mainline? 2022-01-06 Roger Sayle gcc/ChangeLog * config/nvptx/nvptx.md (*cnot2): New define_insn. gcc/testsuite/ChangeLog * gcc.target/nvptx/cnot-1.c: New test case. Thanks in ad

[PATCH] x86_64: Improve (interunit) moves from TImode to V1TImode.

2022-01-06 Thread Roger Sayle
%xmm1, %xmm0 ret Hence the solution (i.e. this patch) is to add a special case to ix86_expand_vector_move for TImode to V1TImode transfers. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check with no new failures. Ok for mainline? 20

[PATCH take #3] Recognize MULT_HIGHPART_EXPR in tree-ssa-math-opts pass.

2022-01-06 Thread Roger Sayle
s working/continues to work. This patch has been tested on x86_64-pc-linux-gnu with a make bootstrap and make -k check (both with and without --target_board='unix{-m32}') with no new regressions. Ok for mainline? 2022-01-06 Roger Sayle gcc/ChangeLog * tr

[PATCH] nvptx: Improved support for HFMode including neghf2 and abshf2.

2022-01-08 Thread Roger Sayle
ath variant (or revisit this decision). This patch has been tested on nvptx-none hosted on x86_64-pc-linux-gnu (including newlib) with a make and make -k check with no new failures. Ok for mainline? 2022-01-08 Roger Sayle gcc/ChangeLog * config/nvptx/nvptx.md (*cmpf): New define_insn.

[PATCH] nvptx: Expand QI mode operations using SI mode instructions.

2022-01-10 Thread Roger Sayle
selp.u32%r38, 1, 0, %r34; and.b32 %value, %r37, %r38; This patch has been tested on nvptx-none hosted on x86_64-pc-linux-gnu (including newlib) with a make and make -k check with no new failures. Ok for mainline? 2022-01-10 Roger Sayle gcc/ChangeLog

[PATCH] x86_64: Improvements to arithmetic right shifts of V1TImode values.

2022-01-11 Thread Roger Sayle
This idiom is safe to use for shifts by 127, but that case gets handled by a two operation sequence earlier in this function. This patch has been tested on x86_64-pc-linux-gnu with a make bootstrap and make -k check with no new failures. OK for mainline? 2022-01-11 Roger Sayle gc

RE: [PATCH] x86_64: Improvements to arithmetic right shifts of V1TImode values.

2022-01-14 Thread Roger Sayle
This patch has been tested on x86_64-pc-linux-gnu with a make bootstrap and make -k check with no new failures. OK for mainline? 2022-01-14 Roger Sayle Uroš Bizjak gcc/ChangeLog * config/i386/i386-expand.c (ix86_expand_v1ti_to_ti): Use force_reg. (ix86_expand_ti_to

[PATCH] nvptx: Add support for 64-bit mul.hi (and other) instructions.

2022-01-14 Thread Roger Sayle
patch has been tested on nvptx-none hosted on x86_64-pc-linux-gnu (including newlib) with a make and make -k check with no new failures. Ok for mainline? 2022-01-14 Roger Sayle gcc/ChangeLog * config/nvptx/nvptx.md (UNSPEC_ISINF): New UNSPEC. (one_cmplbi2): New define_insn for no

[PATCH] nvptx: Fix and use BI mode logic instructions (e.g. and.pred).

2022-01-16 Thread Roger Sayle
check with no new failures. Ok for mainline? 2022-01-16 Roger Sayle gcc/ChangeLog * config/nvptx/nvptx.md (any_logic): Move code iterator earlier in machine description. (logic): Move code attribute earlier in machine description. (ilogic): New code attribut

[Committed] New test case gcc.target/avr/pr54816.c

2023-04-16 Thread Roger Sayle
PR target/54816 is now fixed on mainline. This adds a test case to check that it doesn't regress in future. Tested with a cross compiler to avr-elf. Committed as obvious. 2023-04-16 Roger Sayle gcc/testsuite/ChangeLog PR target/54816 * gcc.target/avr/pr54816.c: New

[PATCH][ARM] Use utxb rN, rM, ror #8 to implement zero_extract on armv6.

2018-01-15 Thread Roger Sayle
and suitable. [Thanks in advance and apologies for any inconvenience]. 2018-01-14 Roger Sayle * config/arm/arm.md (*arm_zeroextractsi2_8_8, *arm_signextractsi2_8_8, *arm_zeroextractsi2_8_16, *arm_signextractsi2_8_16, *arm_zeroextractsi2_16_8, *arm_signextractsi2_16_8): Ne

[PATCH] POPCOUNT folding optimizations

2018-02-09 Thread Roger Sayle
gcc.dg test cases. Many thanks in advance. Best regards, Roger -- Roger Sayle, PhD. NextMove Software Limited Innovation Centre (Unit 23), Cambridge Science Park, Cambridge, CB4 0EY 2018-02-09 Roger Sayle * fold-const.c (tree_nonzero_bits): New function. * fold-const.h (tree_no

[PATCH] PR middle-end/21137: Folding if (((int)x >> 31) & 64) into if ((int)x < 0)

2016-08-08 Thread Roger Sayle
ld-const.c. Most of this patch is the resulting re-indentation. Tested on x86_64-pc-linux-gnu with "make bootstrap" and "make check" with no regressions. Ok for mainline? 2016-08-05 Roger Sayle PR middle-end/21137 * fold-const.c (fold_binary_loc) : Allow tra
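
The fold named in the subject line rests on the fact that an arithmetic right shift by 31 smears the sign bit across a 32-bit int, so the result is either 0 or all ones, and masking it with 64 (or any nonzero constant) is nonzero exactly when the value is negative. The following is a minimal sketch in C, not code from the patch; the function names are hypothetical, and it assumes that >> on a negative signed value is an arithmetic shift, which ISO C leaves implementation-defined but GCC provides.

```c
#include <stdint.h>

/* Original form. Assumes >> on a negative int32_t is an arithmetic
   shift (implementation-defined in ISO C; GCC behaves this way). */
static int shift_mask_test(int32_t x) { return ((x >> 31) & 64) != 0; }

/* Folded form: a plain sign test. */
static int sign_test(int32_t x) { return x < 0; }

/* Returns 1 if both forms agree on a handful of boundary values. */
static int sign_forms_agree(void) {
  static const int32_t samples[] = { 0, 1, -1, 64, -64, INT32_MAX, INT32_MIN };
  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    if (shift_mask_test(samples[i]) != sign_test(samples[i])) return 0;
  return 1;
}
```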

[PATCH] Synchronize include/dwarf2.def with binutils

2025-02-10 Thread Roger Sayle
file by copying the definition of DW_CFA_AARCH64_negate_ra_state_with_pc from binutils, restoring the ability to build a combined source tree. Tested on x86_64-pc-linux-gnu with "make bootstrap". Ok for mainline? 2025-02-10 Roger Sayle include/Chang

RE: [PATCH] rtlanal, i386: Adjust pattern_cost and x86 constant cost [PR115910]

2025-04-03 Thread Roger Sayle
issue (where the current trunk implementation is typically more correct than GCC 14's). Thoughts? > -Original Message- > From: Jakub Jelinek > Sent: 02 April 2025 12:30 > To: Richard Biener ; Jan Hubicka ; Uros Bizjak > ; Roger Sayle ; Richard > Sandiford >

RE: [PATCH] rtlanal, i386: Adjust pattern_cost and x86 constant cost [PR115910]

2025-04-03 Thread Roger Sayle
I agree returning to the GCC 14 behaviour is the best approach given the current stage. > -Original Message- > From: Jakub Jelinek > Sent: 03 April 2025 09:16 > To: Roger Sayle > Cc: 'Richard Biener' ; 'Jan Hubicka' ; 'Uros > Bizjak' ;
