[COMMITTED] i386: Also require TARGET_AVX512BW to generate truncv16hiv16qi2 [PR110021]

2023-05-29 Thread Uros Bizjak via Gcc-patches
gcc/ChangeLog: PR target/110021 * config/i386/i386-expand.cc (ix86_expand_vecop_qihi2): Also require TARGET_AVX512BW to generate truncv16hiv16qi2. Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}. Uros. diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-

[PATCH] rtlanal: Change return type of predicate functions from int to bool

2023-05-29 Thread Uros Bizjak via Gcc-patches
gcc/ChangeLog: * rtl.h (rtx_addr_can_trap_p): Change return type from int to bool. (rtx_unstable_p): Ditto. (reg_mentioned_p): Ditto. (reg_referenced_p): Ditto. (reg_used_between_p): Ditto. (reg_set_between_p): Ditto. (modified_between_p): Ditto. (no_labels_between_
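The int-to-bool change that recurs through this series can be sketched roughly as follows. This is only an illustration, not GCC code: `toy_expr` and both `toy_reg_mentioned_p*` functions are hypothetical stand-ins and do not reflect the real `rtx` type or the real `reg_mentioned_p` signature.

```cpp
#include <cassert>

// Hypothetical stand-in for an RTL expression; NOT GCC's real rtx type.
struct toy_expr {
  int regno;           // register number this node refers to, or -1
  const toy_expr *op;  // optional single operand, or nullptr
};

// Old style: an int is used as a truth value (0 or 1).
int toy_reg_mentioned_p_int(int regno, const toy_expr *x) {
  for (; x; x = x->op)
    if (x->regno == regno)
      return 1;
  return 0;
}

// New style: same logic, but the signature now says "this is a predicate".
bool toy_reg_mentioned_p(int regno, const toy_expr *x) {
  for (; x; x = x->op)
    if (x->regno == regno)
      return true;
  return false;
}
```

Callers that only test the result in a condition need no change, which is presumably why such conversions can be done one file at a time.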

Re: [x86_64 PATCH] PR target/109973: CCZmode and CCCmode variants of [v]ptest.

2023-05-30 Thread Uros Bizjak via Gcc-patches
On Mon, May 29, 2023 at 8:17 PM Roger Sayle wrote: > > > This is my proposed minimal fix for PR target/109973 (hopefully suitable > for backporting) that follows Jakub Jelinek's suggestion that we introduce > CCZmode and CCCmode variants of ptest and vptest, so that the i386 > backend treats [v]pt

Re: [x86_64 PATCH] PR target/109973: CCZmode and CCCmode variants of [v]ptest.

2023-05-30 Thread Uros Bizjak via Gcc-patches
On Tue, May 30, 2023 at 9:39 AM Uros Bizjak wrote: > > On Mon, May 29, 2023 at 8:17 PM Roger Sayle > wrote: > > > > > > This is my proposed minimal fix for PR target/109973 (hopefully suitable > > for backporting) that follows Jakub Jelinek's suggestion that we introduce > > CCZmode and CCCmode

[PATCH] jump: Change return type of predicate functions from int to bool

2023-05-30 Thread Uros Bizjak via Gcc-patches
gcc/ChangeLog: * rtl.h (comparison_dominates_p): Change return type from int to bool. (condjump_p): Ditto. (any_condjump_p): Ditto. (any_uncondjump_p): Ditto. (simplejump_p): Ditto. (returnjump_p): Ditto. (eh_returnjump_p): Ditto. (onlyjump_p): Ditto. (invert_ju

Re: [PATCH] jump: Change return type of predicate functions from int to bool

2023-05-31 Thread Uros Bizjak via Gcc-patches
On Wed, May 31, 2023 at 9:17 AM Richard Biener wrote: > > On Tue, May 30, 2023 at 9:01 PM Jeff Law via Gcc-patches > wrote: > > > > > > > > On 5/30/23 08:36, Uros Bizjak via Gcc-patches wrote: > > > gcc/ChangeLog: > > > > > > * rt

Re: [PATCH] libgcc: Use initarray section type for .init_stack

2023-05-31 Thread Uros Bizjak via Gcc-patches
On Wed, May 31, 2023 at 9:40 AM Kewen.Lin wrote: > > Hi Andreas, > > on 2023/5/25 15:25, Andreas Krebbel wrote: > > On 3/20/23 07:33, Kewen.Lin wrote: > >> Hi, > >> > >> One of my workmates found there is a warning like: > >> > >> libgcc/config/rs6000/morestack.S:402: Warning: ignoring > >>

[PATCH] alias: Change return type of predicate functions from int to bool

2023-05-31 Thread Uros Bizjak via Gcc-patches
Also remove a bunch of unneeded forward declarations. gcc/ChangeLog: * rtl.h (true_dependence): Change return type from int to bool. (canon_true_dependence): Ditto. (read_dependence): Ditto. (anti_dependence): Ditto. (canon_anti_dependence): Ditto. (output_dependence): Dit

[PATCH] emit-rtl: Change return type of predicate functions from int to bool

2023-05-31 Thread Uros Bizjak via Gcc-patches
Also fix some stale comments. gcc/ChangeLog: * rtl.h (subreg_lowpart_p): Change return type from int to bool. (active_insn_p): Ditto. (in_sequence_p): Ditto. (unshare_all_rtl): Change return type from int to void. * emit-rtl.h (mem_expr_equal_p): Change return type from int

[COMMITTED] cse: Change return type of predicate functions from int to bool

2023-06-01 Thread Uros Bizjak via Gcc-patches
Also change some function arguments to bool and remove one instance of always zero function argument. gcc/ChangeLog: * rtl.h (exp_equiv_p): Change return type from int to bool. * cse.cc (mention_regs): Change return type from int to bool and adjust function body accordingly. (exp_

Re: [PATCH] i386: Add missing vector truncate patterns [PR92658].

2023-06-02 Thread Uros Bizjak via Gcc-patches
On Fri, Jun 2, 2023 at 2:49 AM liuhongt wrote: > > Add missing insn patterns for v2si -> v2hi/v2qi and v2hi-> v2qi vector > truncate. > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > Ok for trunk? > > gcc/ChangeLog: > > PR target/92658 > * config/i386/mmx.md (truncv2

[COMMITTED] reg-stack: Change return type of predicate functions from int to bool

2023-06-02 Thread Uros Bizjak via Gcc-patches
Also change some internal variables to bool and recode handling of boolean variables to not use bitwise or. gcc/ChangeLog: * rtl.h (stack_regs_mentioned): Change return type from int to bool. * reg-stack.cc (struct_block_info_def): Change "done" to bool. (stack_regs_mentioned_p): Chan
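Recoding a flag so that it no longer relies on bitwise or might look like the sketch below. All names (`toy_clamp_negative`, `toy_process_all`, `toy_process_all_int`) are hypothetical and are not taken from reg-stack.cc. Note that the rewrite keeps calling the worker for every element; a short-circuiting `changed = changed || f()` would skip calls once the flag is set.

```cpp
#include <cassert>
#include <vector>

// Hypothetical worker: clamps one value in place and reports whether it changed.
bool toy_clamp_negative(int &v) {
  if (v < 0) {
    v = 0;
    return true;
  }
  return false;
}

// Before: an int flag accumulated with bitwise or.
int toy_process_all_int(std::vector<int> &vals) {
  int changed = 0;
  for (int &v : vals)
    changed |= toy_clamp_negative(v);
  return changed;
}

// After: a bool flag without bitwise or; every element is still processed.
bool toy_process_all(std::vector<int> &vals) {
  bool changed = false;
  for (int &v : vals)
    if (toy_clamp_negative(v))
      changed = true;
  return changed;
}
```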

Re: [x86_64 PATCH] PR target/110083: Fix-up REG_EQUAL notes on COMPARE in STV.

2023-06-04 Thread Uros Bizjak via Gcc-patches
On Sat, Jun 3, 2023 at 7:31 PM Roger Sayle wrote: > > > This patch fixes PR target/110083, an ICE-on-valid regression exposed by > my recent PTEST improvements (to address PR target/109973). The latent > bug (admittedly mine) is that the scalar-to-vector (STV) pass doesn't update > or delete REG_

Re: [x86 PATCH] Add support for stc, clc and cmc instructions in i386.md

2023-06-04 Thread Uros Bizjak via Gcc-patches
On Sun, Jun 4, 2023 at 12:45 AM Roger Sayle wrote: > > > This patch is the latest revision of my patch to add support for the > STC (set carry flag), CLC (clear carry flag) and CMC (complement > carry flag) instructions to the i386 backend, incorporating Uros' > previous feedback. The significant

[COMMITTED] reginfo: Change return type of predicate functions from int to bool

2023-06-05 Thread Uros Bizjak via Gcc-patches
gcc/ChangeLog: * rtl.h (reg_classes_intersect_p): Change return type from int to bool. (reg_class_subset_p): Ditto. * reginfo.cc (reg_classes_intersect_p): Ditto. (reg_class_subset_p): Ditto. Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}. Uros diff --git a/gcc/re

[COMMITTED] print-rtl: Change return type of two print functions from int to void

2023-06-05 Thread Uros Bizjak via Gcc-patches
Also change one internal variable to bool. gcc/ChangeLog: * rtl.h (print_rtl_single): Change return type from int to void. (print_rtl_single_with_indent): Ditto. * print-rtl.h (class rtx_writer): Ditto. Change m_sawclose to bool. * print-rtl.cc (rtx_writer::rtx_writer): Update fo

Re: [PATCH] Fold _mm{, 256, 512}_abs_{epi8, epi16, epi32, epi64} into gimple ABSU_EXPR + VCE.

2023-06-06 Thread Uros Bizjak via Gcc-patches
On Tue, Jun 6, 2023 at 6:33 AM liuhongt via Gcc-patches wrote: > > r14-1145 fold the intrinsics into gimple ABS_EXPR which has UB for > TYPE_MIN, but PABSB will store unsigned result into dst. The patch > uses ABSU_EXPR + VCE instead of ABS_EXPR. > > Also don't fold _mm_abs_{pi8,pi16,pi32} w/o TAR
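The difference between a signed absolute value and an unsigned one can be illustrated on a single 8-bit element. This is a scalar sketch, not the patch itself: `toy_absu8` models what an unsigned absolute value computes, and `toy_view_as_signed` models the bit-preserving reinterpretation that a view-convert performs. Both names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Unsigned absolute value of a signed 8-bit element. It is defined for every
// input, including INT8_MIN, whose magnitude (128) does not fit in int8_t.
uint8_t toy_absu8(int8_t x) {
  int wide = x;  // widen first so the negation cannot overflow
  return static_cast<uint8_t>(wide < 0 ? -wide : wide);
}

// Reinterpret the bits as a signed element again without changing them.
int8_t toy_view_as_signed(uint8_t u) {
  int8_t s;
  std::memcpy(&s, &u, sizeof s);
  return s;
}
```

For the minimum value, the unsigned result is 128 (bit pattern 0x80); viewed as a signed element that pattern is again the minimum value, which matches the description of PABSB storing an unsigned result.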

Re: [PATCH] Fold _mm{, 256, 512}_abs_{epi8, epi16, epi32, epi64} into gimple ABSU_EXPR + VCE.

2023-06-06 Thread Uros Bizjak via Gcc-patches
On Tue, Jun 6, 2023 at 6:33 AM liuhongt via Gcc-patches wrote: > > r14-1145 fold the intrinsics into gimple ABS_EXPR which has UB for > TYPE_MIN, but PABSB will store unsigned result into dst. The patch > uses ABSU_EXPR + VCE instead of ABS_EXPR. > > Also don't fold _mm_abs_{pi8,pi16,pi32} w/o TAR

Re: [PATCH] Fold _mm{, 256, 512}_abs_{epi8, epi16, epi32, epi64} into gimple ABSU_EXPR + VCE.

2023-06-06 Thread Uros Bizjak via Gcc-patches
On Tue, Jun 6, 2023 at 1:42 PM Hongtao Liu wrote: > > On Tue, Jun 6, 2023 at 5:11 PM Uros Bizjak wrote: > > > > On Tue, Jun 6, 2023 at 6:33 AM liuhongt via Gcc-patches > > wrote: > > > > > > r14-1145 fold the intrinsics into gimple ABS_EXPR which has UB for > > > TYPE_MIN, but PABSB will store u

[COMMITTED] reload1: Change return type of predicate function from int to bool

2023-06-06 Thread Uros Bizjak via Gcc-patches
gcc/ChangeLog: * rtl.h (function_invariant_p): Change return type from int to bool. * reload1.cc (function_invariant_p): Change return type from int to bool and adjust function body accordingly. Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}. Uros. diff --git a/gcc/re

Re: [x86 PATCH] Add support for stc, clc and cmc instructions in i386.md

2023-06-06 Thread Uros Bizjak via Gcc-patches
On Tue, Jun 6, 2023 at 5:14 PM Roger Sayle wrote: > > > Hi Uros, > This revision implements your suggestions/refinements. (i) Avoid the > UNSPEC_CMC by using the canonical RTL idiom for *x86_cmc, (ii) Use > peephole2s to convert x86_stc and *x86_cmc into alternate forms on > TARGET_SLOW_STC CPUs (

Re: [x86 PATCH] Add support for stc, clc and cmc instructions in i386.md

2023-06-06 Thread Uros Bizjak via Gcc-patches
On Tue, Jun 6, 2023 at 11:00 PM Roger Sayle wrote: > > > Hi Uros, > Might you be willing to approve the patch without the *x86_clc pieces? > These can be submitted later, when they are actually used. For now, > we're arguing about the performance of a pattern that's not yet > generated on an obsolet

Re: [x86 PATCH] PR target/31985: Improve memory operand use with doubleword add.

2023-06-06 Thread Uros Bizjak via Gcc-patches
On Wed, Jun 7, 2023 at 1:05 AM Roger Sayle wrote: > > > This patch addresses the last remaining issue with PR target/31985, that > GCC could make better use of memory addressing modes when implementing > double word addition. This is achieved by adding a define_insn_and_split > that combines an *

Re: [x86 PATCH] PR target/31985: Improve memory operand use with doubleword add.

2023-06-07 Thread Uros Bizjak via Gcc-patches
On Wed, Jun 7, 2023 at 8:32 AM Uros Bizjak wrote: > > On Wed, Jun 7, 2023 at 1:05 AM Roger Sayle wrote: > > > > > > This patch addresses the last remaining issue with PR target/31985, that > > GCC could make better use of memory addressing modes when implementing > > double word addition. This i

Re: [PATCH] New finish_compare_by_pieces target hook (for x86).

2023-06-12 Thread Uros Bizjak via Gcc-patches
On Mon, Jun 12, 2023 at 4:03 PM Roger Sayle wrote: > > > The following simple test case, from PR 104610, shows that memcmp () == 0 > can result in some bizarre code sequences on x86. > > int foo(char *a) > { > static const char t[] = "0123456789012345678901234567890"; > return __builtin_me

Re: Patch ping (Re: [PATCH] middle-end, i386: Pattern recognize add/subtract with carry [PR79173])

2023-06-13 Thread Uros Bizjak via Gcc-patches
On Tue, Jun 13, 2023 at 9:06 AM Jakub Jelinek wrote: > > Hi! > > On Tue, Jun 06, 2023 at 11:42:07PM +0200, Jakub Jelinek via Gcc-patches wrote: > > The following patch introduces {add,sub}c5_optab and pattern recognizes > > various forms of add with carry and subtract with carry/borrow, see > > pr

Re: [x86 PATCH] Convert ptestz of pandn into ptestc.

2023-06-14 Thread Uros Bizjak via Gcc-patches
On Tue, Jun 13, 2023 at 6:03 PM Roger Sayle wrote: > > > This patch is the next instalment in a set of backend patches around > improvements to ptest/vptest. A previous patch optimized the sequence > t=pand(x,y); ptestz(t,t) into the equivalent ptestz(x,y), using the > property that ZF is set to
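The two identities involved can be checked with a scalar model. The sketch assumes the usual PTEST flag definitions (ZF set when `a & b` is zero, CF set when `~a & b` is zero) and uses 64-bit integers in place of vector registers; all `toy_*` names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model on 64 bits; the real instructions work on 128-bit or wider vectors.
bool toy_ptestz(uint64_t a, uint64_t b) { return (a & b) == 0; }   // models ZF
bool toy_ptestc(uint64_t a, uint64_t b) { return (~a & b) == 0; }  // models CF
uint64_t toy_pand(uint64_t a, uint64_t b) { return a & b; }
uint64_t toy_pandn(uint64_t a, uint64_t b) { return ~a & b; }

// Earlier optimization: t = pand(x, y); ptestz(t, t)  ==  ptestz(x, y).
bool toy_pand_identity_holds(uint64_t x, uint64_t y) {
  uint64_t t = toy_pand(x, y);
  return toy_ptestz(t, t) == toy_ptestz(x, y);
}

// This patch's subject: t = pandn(x, y); ptestz(t, t)  ==  ptestc(x, y).
bool toy_pandn_identity_holds(uint64_t x, uint64_t y) {
  uint64_t t = toy_pandn(x, y);
  return toy_ptestz(t, t) == toy_ptestc(x, y);
}
```

Both identities follow from `ptestz(t, t)` being true exactly when `t` is zero.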

[PATCH] RTL: Merge rtx_equal_p and hash_rtx functions with their callback variants

2023-06-14 Thread Uros Bizjak via Gcc-patches
Use default argument when callback function is not required to merge rtx_equal_p and hash_rtx functions with their callback variants. gcc/ChangeLog: * cse.cc (hash_rtx_cb): Rename to hash_rtx. (hash_rtx): Remove. * early-remat.cc (remat_candidate_hasher::equal): Update to call rtx
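Merging a plain function with its callback variant through a default argument can be sketched as below. The callback type and the `toy_*` functions are hypothetical; the real `rtx_equal_p` and `hash_rtx` signatures differ.

```cpp
#include <cassert>

// Hypothetical callback type: lets a caller override how two values compare.
using toy_equal_cb = bool (*)(int a, int b);

// One function with a defaulted callback replaces a "plain" + "_cb" pair.
// Existing two-argument callers keep compiling unchanged.
bool toy_equal_p(int a, int b, toy_equal_cb cb = nullptr) {
  if (cb)
    return cb(a, b);
  return a == b;
}

// Example callback: treat values as equal when their last decimal digits match.
bool toy_equal_mod10(int a, int b) { return a % 10 == b % 10; }
```

A call such as `toy_equal_p(3, 13)` uses the default comparison, while `toy_equal_p(3, 13, toy_equal_mod10)` goes through the callback.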

Re: [PATCH] middle-end, i386, v3: Pattern recognize add/subtract with carry [PR79173]

2023-06-14 Thread Uros Bizjak via Gcc-patches
On Wed, Jun 14, 2023 at 4:00 PM Jakub Jelinek wrote: > > Hi! > > On Wed, Jun 14, 2023 at 12:35:42PM +, Richard Biener wrote: > > At this point two pages of code without a comment - can you introduce > > some vertical spacing and comments as to what is matched now? The > > split out functions

Re: [PATCH] middle-end, i386, v3: Pattern recognize add/subtract with carry [PR79173]

2023-06-14 Thread Uros Bizjak via Gcc-patches
On Wed, Jun 14, 2023 at 4:00 PM Jakub Jelinek wrote: > > Hi! > > On Wed, Jun 14, 2023 at 12:35:42PM +, Richard Biener wrote: > > At this point two pages of code without a comment - can you introduce > > some vertical spacing and comments as to what is matched now? The > > split out functions

Re: [PATCH] middle-end, i386, v3: Pattern recognize add/subtract with carry [PR79173]

2023-06-14 Thread Uros Bizjak via Gcc-patches
On Wed, Jun 14, 2023 at 4:56 PM Jakub Jelinek wrote: > > On Wed, Jun 14, 2023 at 04:34:27PM +0200, Uros Bizjak wrote: > > LGTM for the x86 part. I did my best, but those peephole2 patterns are > > real PITA to be reviewed thoroughly. > > > > Maybe split out peephole2 pack to a separate patch, foll

Re: [PATCH] x86: correct and improve "*vec_dupv2di"

2023-06-15 Thread Uros Bizjak via Gcc-patches
On Thu, Jun 15, 2023 at 8:03 AM Jan Beulich via Gcc-patches wrote: > > The input constraint for the %vmovddup alternative was wrong, as the > upper 16 XMM registers require AVX512VL to be used with this insn. To > compensate, introduce a new alternative permitting all 32 registers, by > broadcasti

Re: [PATCH] x86: correct and improve "*vec_dupv2di"

2023-06-15 Thread Uros Bizjak via Gcc-patches
On Thu, Jun 15, 2023 at 10:15 AM Jan Beulich wrote: > > On 15.06.2023 09:45, Hongtao Liu wrote: > > On Thu, Jun 15, 2023 at 3:07 PM Uros Bizjak via Gcc-patches > > wrote: > >> On Thu, Jun 15, 2023 at 8:03 AM Jan Beulich via Gcc-patches > >> wrote:

Re: [x86 PATCH] PR target/31985: Improve memory operand use with doubleword add.

2023-06-16 Thread Uros Bizjak via Gcc-patches
On Fri, Jun 16, 2023 at 12:04 AM Roger Sayle wrote: > > > Hi Uros, > > > On the 7th June 2023, Uros Bizjak wrote: > > The register allocator considers the instruction-to-be-split as one > > instruction, so it > > can allocate output register to match an input register (or a register that > > for

Re: [x86 PATCH] PR target/110598: Fix rega = 0; rega ^= rega regression.

2023-07-12 Thread Uros Bizjak via Gcc-patches
On Tue, Jul 11, 2023 at 9:07 PM Roger Sayle wrote: > > > This patch fixes the regression PR target/110598 caused by my recent > addition of a peephole2. The intention of that optimization was to > simplify zeroing a register, followed by an IOR, XOR or PLUS operation > on it into a move, or as de

Re: [x86 PATCH] Fix FAIL of gcc.target/i386/pr91681-1.c

2023-07-12 Thread Uros Bizjak via Gcc-patches
On Tue, Jul 11, 2023 at 10:07 PM Roger Sayle wrote: > > > The recent change in TImode parameter passing on x86_64 results in the > FAIL of pr91681-1.c. The issue is that with the extra flexibility, > the combine pass is now spoilt for choice between using either the > *add3_doubleword_concat or t

Re: [PATCH] simplify-rtx: Fix invalid simplification with paradoxical subregs [PR110206]

2023-07-12 Thread Uros Bizjak via Gcc-patches
>> > On Mon, Jul 10, 2023 at 11:26 AM Uros Bizjak wrote: > >> > > > >> > > On Mon, Jul 10, 2023 at 11:17 AM Richard Biener > >> > > wrote: > >> > > > > >> > > > On Sun, Jul 9, 2023 at 10:53 AM Uros Bizjak vi

Re: [PATCH] simplify-rtx: Fix invalid simplification with paradoxical subregs [PR110206]

2023-07-12 Thread Uros Bizjak via Gcc-patches
>> On Mon, Jul 10, 2023 at 11:47 AM Richard Biener > > >> wrote: > > >> > > > >> > On Mon, Jul 10, 2023 at 11:26 AM Uros Bizjak wrote: > > >> > > > > >> > > On Mon, Jul 10, 2023 at 11:17 AM Richard

[committed] ifcvt: Change return type of predicate functions from int to bool

2023-07-12 Thread Uros Bizjak via Gcc-patches
Also change some internal variables and function arguments from int to bool. gcc/ChangeLog: * ifcvt.cc (cond_exec_changed_p): Change variable to bool. (last_active_insn): Change "skip_use_p" function argument to bool. (noce_operand_ok): Change return type from int to bool. (find_c

[committed] IRA+LRA: Change return type of predicate functions from int to bool

2023-07-12 Thread Uros Bizjak via Gcc-patches
gcc/ChangeLog: * ira.cc (equiv_init_varies_p): Change return type from int to bool and adjust function body accordingly. (equiv_init_movable_p): Ditto. (memref_used_between_p): Ditto. * lra-constraints.cc (valid_address_p): Ditto. Bootstrapped and regression tested on x86_64-l

[committed] alpha: Fix computation mode in alpha_emit_set_long_const [PR106966]

2023-07-13 Thread Uros Bizjak via Gcc-patches
PR target/106966 gcc/ChangeLog: * config/alpha/alpha.cc (alpha_emit_set_long_const): Always use DImode when constructing long const. gcc/testsuite/ChangeLog: * gcc.target/alpha/pr106966.c: New test. Bootstrapped and regression tested by Matthias on alpha-linux-gnu. Uros. diff

Re: [x86 PATCH] PR target/110588: Add *bt_setncqi_2 to generate btl

2023-07-13 Thread Uros Bizjak via Gcc-patches
On Thu, Jul 13, 2023 at 7:10 PM Roger Sayle wrote: > > > This patch resolves PR target/110588 to catch another case in combine > where the i386 backend should be generating a btl instruction. This adds > another define_insn_and_split to recognize the RTL representation for this > case. > > I also

[PATCH] cprop: Do not set REG_EQUAL note when simplifying paradoxical subreg [PR110206]

2023-07-13 Thread Uros Bizjak via Gcc-patches
cprop1 pass does not consider paradoxical subreg and for (insn 22) claims that it equals 8 elements of HImode by setting REG_EQUAL note: (insn 21 19 22 4 (set (reg:V4QI 98) (mem/u/c:V4QI (symbol_ref/u:DI ("*.LC1") [flags 0x2]) [0 S4 A32])) "pr110206.c":12:42 1530 {*movv4qi_internal} (

Re: [PATCH] i386: Auto vectorize usdot_prod, udot_prod with AVXVNNIINT16 instruction.

2023-07-14 Thread Uros Bizjak via Gcc-patches
On Fri, Jul 14, 2023 at 8:24 AM Haochen Jiang wrote: > > Hi all, > > This patch aims to auto vectorize usdot_prod and udot_prod with newly > introduced AVX-VNNI-INT16. > > Also I refined the redundant mode iterator in the patch. > > Regtested on x86_64-pc-linux-gnu. Ok for trunk after AVX-VNNI-INT

Re: [x86_64 PATCH] Improved insv of DImode/DFmode {high,low}parts into TImode.

2023-07-14 Thread Uros Bizjak via Gcc-patches
On Thu, Jul 13, 2023 at 6:45 PM Roger Sayle wrote: > > > This is the next piece towards a fix for (the x86_64 ABI issues affecting) > PR 88873. This patch generalizes the recent tweak to ix86_expand_move > for setting the highpart of a TImode reg from a DImode source using > *insvti_highpart_1, t

Re: [PATCH] cprop: Do not set REG_EQUAL note when simplifying paradoxical subreg [PR110206]

2023-07-14 Thread Uros Bizjak via Gcc-patches
On Fri, Jul 14, 2023 at 10:31 AM Richard Biener wrote: > > On Fri, 14 Jul 2023, Uros Bizjak wrote: > > > cprop1 pass does not consider paradoxical subreg and for (insn 22) claims > > that it equals 8 elements of HImode by setting REG_EQUAL note: > > > > (insn 21 19 22 4 (set (reg:V4QI 98) > >

Re: [PATCH] cprop: Do not set REG_EQUAL note when simplifying paradoxical subreg [PR110206]

2023-07-14 Thread Uros Bizjak via Gcc-patches
On Fri, Jul 14, 2023 at 10:53 AM Richard Biener wrote: > > On Fri, 14 Jul 2023, Uros Bizjak wrote: > > > On Fri, Jul 14, 2023 at 10:31 AM Richard Biener wrote: > > > > > > On Fri, 14 Jul 2023, Uros Bizjak wrote: > > > > > > > cprop1 pass does not consider paradoxical subreg and for (insn 22) > >

Re: [x86 PATCH] PR target/110588: Add *bt_setncqi_2 to generate btl

2023-07-14 Thread Uros Bizjak via Gcc-patches
On Fri, Jul 14, 2023 at 11:27 AM Roger Sayle wrote: > > > > From: Uros Bizjak > > Sent: 13 July 2023 19:21 > > > > On Thu, Jul 13, 2023 at 7:10 PM Roger Sayle > > wrote: > > > > > > This patch resolves PR target/110588 to catch another case in combine > > > where the i386 backend should be gener

Re: [PATCH] x86: replace "extendhfdf2" expander

2023-07-14 Thread Uros Bizjak via Gcc-patches
On Fri, Jul 14, 2023 at 11:44 AM Jan Beulich wrote: > > The corresponding insn serves this purpose quite fine, and leads to > slightly less (generated) code. All we need is the insn to not have a > leading * in its name, while retaining that * for "extendhfsf2". > Introduce a mode attribute in exc

Re: [PATCH] Add peephole to eliminate redundant comparison after cmpccxadd.

2023-07-17 Thread Uros Bizjak via Gcc-patches
On Mon, Jul 17, 2023 at 8:44 AM Hongtao Liu wrote: > > Ping. > > On Tue, Jul 11, 2023 at 5:16 PM liuhongt via Gcc-patches > wrote: > > > > Similar like we did for CMPXCHG, but extended to all > > ix86_comparison_int_operator since CMPCCXADD set EFLAGS exactly same > > as CMP. > > > > When operand

Re: [PATCH 1/2] [i386] Support type _Float16/__bf16 independent of SSE2.

2023-07-17 Thread Uros Bizjak via Gcc-patches
On Mon, Jul 17, 2023 at 10:28 AM Hongtao Liu wrote: > > I'd like to ping for this patch (only patch 1/2, for patch 2/2, I > think that may not be necessary). > > On Mon, May 15, 2023 at 9:20 AM Hongtao Liu wrote: > > > > ping. > > > > On Fri, Apr 21, 2023 at 9:55 PM liuhongt wrote: > > > > > > >

[committed] combine: Change return type of predicate functions from int to bool

2023-07-17 Thread Uros Bizjak via Gcc-patches
Also change some internal variables and function arguments from int to bool. gcc/ChangeLog: * combine.cc (struct reg_stat_type): Change last_set_invalid to bool. (cant_combine_insn_p): Change return type from int to bool and adjust function body accordingly. (can_combine_p): Ditto

[committed] dwarf2: Change return type of predicate functions from int to bool

2023-07-18 Thread Uros Bizjak via Gcc-patches
Also change some internal variables and function arguments from int to bool. gcc/ChangeLog: * dwarf2asm.cc: Change FALSE to false. * dwarf2cfi.cc (execute_dwarf2_frame): Change return type to void. * dwarf2out.cc (matches_main_base): Change return type from int to bool. Change "l

Re: [GCC 13 PATCH] PR target/109973: CCZmode and CCCmode variants of [v]ptest.

2023-07-19 Thread Uros Bizjak via Gcc-patches
On Wed, Jul 19, 2023 at 2:21 PM Richard Biener wrote: > > On Sun, Jun 11, 2023 at 12:55 AM Roger Sayle > wrote: > > > > > > This is a backport of the fixes for PR target/109973 and PR target/110083. > > > > This backport to the releases/gcc-13 branch has been tested on > > x86_64-pc-linux-gnu wi

Re: [x86_64 PATCH] More TImode parameter passing improvements.

2023-07-19 Thread Uros Bizjak via Gcc-patches
On Wed, Jul 19, 2023 at 10:07 PM Roger Sayle wrote: > > > This patch is the next piece of a solution to the x86_64 ABI issues in > PR 88873. This splits the *concat3_3 define_insn_and_split > into two patterns, a TARGET_64BIT *concatditi3_3 and a !TARGET_64BIT > *concatsidi3_3. This allows us to

Re: [x86_64 PATCH] More TImode parameter passing improvements.

2023-07-20 Thread Uros Bizjak via Gcc-patches
On Thu, Jul 20, 2023 at 9:44 AM Roger Sayle wrote: > > > Hi Uros, > > > From: Uros Bizjak > > Sent: 20 July 2023 07:50 > > > > On Wed, Jul 19, 2023 at 10:07 PM Roger Sayle > > wrote: > > > > > > This patch is the next piece of a solution to the x86_64 ABI issues in > > > PR 88873. This splits t

Re: [PATCH] Optimize vlddqu to vmovdqu for TARGET_AVX

2023-07-20 Thread Uros Bizjak via Gcc-patches
On Thu, Jul 20, 2023 at 9:35 AM liuhongt wrote: > > For Intel processors, after TARGET_AVX, vmovdqu is optimized as fast > as vlddqu, UNSPEC_LDDQU can be removed to enable more optimizations. > Can someone confirm this with AMD folks? > If AMD doesn't like such optimization, I'll put my optimizati

[committed] i386: Double-word sign-extension missed-optimization [PR110717]

2023-07-20 Thread Uros Bizjak via Gcc-patches
When sign-extending the value in a double-word register pair using shift and ashiftrt sequence with the same count immediate value less than word width, there is no need to shift the lower word of the value. The sign-extension could be limited to the upper word, but we uselessly shift the lower wor
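The observation can be checked on a 64-bit value held as two 32-bit words. This is a sketch under assumptions: it models a 32-bit word size, relies on arithmetic right shift of negative values (guaranteed from C++20, and the usual behaviour of earlier compilers), and uses hypothetical `toy_*` names.

```cpp
#include <cassert>
#include <cstdint>

// Reference: shift left then arithmetic-shift right by the same count c,
// performed on the full double-word value.
int64_t toy_sext_full(int64_t x, unsigned c) {
  return static_cast<int64_t>(static_cast<uint64_t>(x) << c) >> c;
}

// Word-wise version: for c < 32 the low word is left untouched and only the
// high word goes through the shift pair.
int64_t toy_sext_high_only(int64_t x, unsigned c) {
  uint32_t lo = static_cast<uint32_t>(x);
  uint32_t hi = static_cast<uint32_t>(static_cast<uint64_t>(x) >> 32);
  int32_t new_hi = static_cast<int32_t>(hi << c) >> c;
  uint64_t joined =
      (static_cast<uint64_t>(static_cast<uint32_t>(new_hi)) << 32) | lo;
  return static_cast<int64_t>(joined);
}
```

With a count below the word width, every bit shifted out of the low word during the left shift comes back during the right shift, so the low word ends up unchanged.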

Re: [x86 PATCH] Use QImode for offsets in zero_extract/sign_extract in i386.md

2023-07-22 Thread Uros Bizjak via Gcc-patches
On Sat, Jul 22, 2023 at 5:37 PM Roger Sayle wrote: > > > As suggested by Uros, this patch changes the ZERO_EXTRACTs and SIGN_EXTRACTs > in i386.md to consistently use QImode for bit offsets (i.e. third and fourth > operands), matching the use of QImode for bit counts in shifts and rotates. > > The

Re: [x86 PATCH] Don't use insvti_{high, low}part with -O0 (for compile-time).

2023-07-22 Thread Uros Bizjak via Gcc-patches
On Sat, Jul 22, 2023 at 4:17 PM Roger Sayle wrote: > > > This patch attempts to help with PR rtl-optimization/110587, a regression > of -O0 compile time for the pathological pr28071.c. My recent patch helps > a bit, but hasn't returned -O0 compile-time to where it was before my > ix86_expand_move

[committed] i386: Clear upper half of XMM register for V2SFmode operations [PR110762]

2023-07-26 Thread Uros Bizjak via Gcc-patches
Clear the upper half of a V4SFmode operand register in front of all potentially trapping instructions. The testcase: --cut here-- typedef float v2sf __attribute__((vector_size(8))); typedef float v4sf __attribute__((vector_size(16))); v2sf test(v4sf x, v4sf y) { v2sf x2, y2; x2 = __builtin_s

[committed] testsuite: Fix gfortran.dg/ieee/comparisons_3.F90 testsuite failures

2023-07-26 Thread Uros Bizjak via Gcc-patches
The testcase should use dg-additional-options instead of dg-options to not overwrite default compile flags that include path for finding the IEEE modules. gcc/testsuite/ChangeLog: * gfortran.dg/ieee/comparisons_3.F90: Use dg-additional-options instead of dg-options. Tested on x86_64-linu

[RFC PATCH] i386: Do not sanitize upper part of V2SFmode reg with -fno-trapping-math [PR110832]

2023-07-30 Thread Uros Bizjak via Gcc-patches
Also introduce -m[no-]mmxfp-with-sse option to disable trapping V2SF named patterns in order to avoid generation of partial vector V4SFmode trapping instructions. The new option is enabled by default, because even with sanitization, a small but consistent speed up of 2 to 3% with Polyhedron capaci

Re: [RFC PATCH] i386: Do not sanitize upper part of V2SFmode reg with -fno-trapping-math [PR110832]

2023-07-31 Thread Uros Bizjak via Gcc-patches
On Mon, Jul 31, 2023 at 11:40 AM Richard Biener wrote: > > On Sun, 30 Jul 2023, Uros Bizjak wrote: > > > Also introduce -m[no-]mmxfp-with-sse option to disable trapping V2SF > > named patterns in order to avoid generation of partial vector V4SFmode > > trapping instructions. > > > > The new option

Re: [PATCH] Optimize vlddqu + inserti128 to vbroadcasti128

2023-08-01 Thread Uros Bizjak via Gcc-patches
On Wed, Aug 2, 2023 at 3:33 AM liuhongt wrote: > > In [1], I propose a patch to generate vmovdqu for all vlddqu intrinsics > after AVX2, it's rejected as > > The instruction is reachable only as __builtin_ia32_lddqu* (aka > > _mm_lddqu_si*), so it was chosen by the programmer for a reason. I > > t

Re: [x86 PATCH] PR target/106122: Don't update %esp via the stack with -Oz.

2022-06-30 Thread Uros Bizjak via Gcc-patches
On Fri, Jul 1, 2022 at 1:00 AM Roger Sayle wrote: > > > When optimizing for size with -Oz, setting a register can be minimized by > pushing an immediate value to the stack and popping it to the destination. > Alas the one general register that shouldn't be updated via the stack is > the stack poin

Re: [Committed] Add constraints to new andn_doubleword_bmi pattern in i386.md.

2022-07-01 Thread Uros Bizjak via Gcc-patches
On Fri, Jul 1, 2022 at 12:17 PM Roger Sayle wrote: > > > Many thanks to Uros for spotting that I'd forgotten to add constraints > to the new define_insn_and_split *andn_doubleword_bmi when moving it > from pre-reload to post-reload. I've pushed this obvious fix after a > make bootstrap on x86_64-

[PATCH] i386: Use "r" constraint in *andn3_doubleword_bmi

2022-07-01 Thread Uros Bizjak via Gcc-patches
ANDN is non-destructive, so use "r" instead of "0" for its operand 1 constraint. 2022-07-01 Uroš Bizjak gcc/ChangeLog: * config/i386/i386.md (*andn3_doubleword_bmi): Use "r" constraint for operand 1. Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}. Pushed to master.

Re: [PATCH] x86: Support 2/4/8 byte constant vector stores

2022-07-01 Thread Uros Bizjak via Gcc-patches
On Thu, Jun 30, 2022 at 4:50 PM H.J. Lu wrote: > > 1. Add a predicate for constant vectors which can be converted to integer > constants suitable for constant integer stores. For a 8-byte constant > vector, the converted 64-bit integer must be valid for store with 64-bit > immediate, which is a 6

Re: [PATCH] i386: Extend cvtps2pd to memory

2022-07-03 Thread Uros Bizjak via Gcc-patches
On Mon, Jul 4, 2022 at 7:10 AM Jiang, Haochen wrote: > > Hi all, > > I revised my patch according to all your reviews. > > Regtested on x86_64-pc-linux-gnu. OK. Thanks, Uros. > > BRs, > Haochen > > > -Original Message- > > From: Liu, Hongtao > > Sent: Thursday, June 30, 2022 4:57 PM >

Re: [x86 PATCH] PR rtl-optimization/96692: ((A|B)^C)^A using andn with -mbmi.

2022-07-04 Thread Uros Bizjak via Gcc-patches
On Mon, Jul 4, 2022 at 7:27 PM Roger Sayle wrote: > > > Hi Uros, > Thanks for the review. This patch implements all of your suggestions, both > removing ix86_pre_reload_split from the combine splitter(s), and dividing > the original splitter up into four simpler variants, that use match_dup to >

Re: [x86 PATCH take #2] Doubleword version of and; cmp to not; test optimization.

2022-07-05 Thread Uros Bizjak via Gcc-patches
On Mon, Jul 4, 2022 at 9:11 PM Roger Sayle wrote: > > > This patch is the latest revision of the patch originally posted at: > https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596201.html > > This patch extends the earlier and;cmp to not;test optimization to also > perform this transformation f

Re: [x86 PATCH take #2] Doubleword version of and; cmp to not; test optimization.

2022-07-05 Thread Uros Bizjak via Gcc-patches
On Tue, Jul 5, 2022 at 9:56 AM Uros Bizjak wrote: > > On Mon, Jul 4, 2022 at 9:11 PM Roger Sayle wrote: > > > > > > This patch is the latest revision of the patch originally posted at: > > https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596201.html > > > > This patch extends the earlier and;c

Re: [PATCH] i386: Handle memory operand for direct call to cvtps2pd in unpack

2022-07-06 Thread Uros Bizjak via Gcc-patches
On Thu, Jul 7, 2022 at 7:52 AM Haochen Jiang wrote: > > Hi all, > > This patch aims to fix the ICE for vec unpack using for memory after the commit > r13-1418 on improper insn of cvtps2pd. > > Regtested on x86_64-pc-linux-gnu. Ok for trunk? > > BRs, > Haochen > > gcc/ChangeLog: > > PR targe

Re: [x86 PATCH] Support *testdi_not_doubleword during STV pass.

2022-07-07 Thread Uros Bizjak via Gcc-patches
On Thu, Jul 7, 2022 at 6:41 PM Roger Sayle wrote: > > > This patch fixes the current two FAILs of pr65105-5.c on x86 when > compiled with -m32. These (temporary) breakages were fallout from my > patches to improve/upgrade (scalar) double word comparisons. > On mainline, the i386 backend currently

Re: [x86 PATCH] Fun with flags: Adding stc/clc instructions to i386.md.

2022-07-08 Thread Uros Bizjak via Gcc-patches
On Fri, Jul 8, 2022 at 9:15 AM Roger Sayle wrote: > > > This patch adds support for x86's single-byte encoded stc (set carry flag) > and clc (clear carry flag) instructions to i386.md. > > The motivating example is the simple code snippet: > > unsigned int foo (unsigned int a, unsigned int b, unsi

Re: [x86 PATCH] Fun with flags: Adding stc/clc instructions to i386.md.

2022-07-08 Thread Uros Bizjak via Gcc-patches
On Fri, Jul 8, 2022 at 9:15 AM Roger Sayle wrote: > > > This patch adds support for x86's single-byte encoded stc (set carry flag) > and clc (clear carry flag) instructions to i386.md. > > The motivating example is the simple code snippet: > > unsigned int foo (unsigned int a, unsigned int b, unsi

Re: [gcc12 backport] PR target/105930: Split *xordi3_doubleword after reload on x86.

2022-07-09 Thread Uros Bizjak via Gcc-patches
On Sat, Jul 9, 2022 at 11:26 AM Roger Sayle wrote: > > > This is a backport of the fix for PR target/105930 from mainline to the > gcc12 release branch. This patch has been retested against the gcc12 > branch on x86_64-pc-linux-gnu with make bootstrap and make -k check, > both with and without --

Re: [x86_64 PATCH] Improved Scalar-To-Vector (STV) support for TImode to V1TImode.

2022-07-10 Thread Uros Bizjak via Gcc-patches
On Sat, Jul 9, 2022 at 2:17 PM Roger Sayle wrote: > > > This patch upgrades x86_64's scalar-to-vector (STV) pass to more > aggressively transform 128-bit scalar TImode operations into vector > V1TImode operations performed on SSE registers. TImode functionality > already exists in STV, but only f

Re: [x86_64 PATCH] Improved Scalar-To-Vector (STV) support for TImode to V1TImode.

2022-07-10 Thread Uros Bizjak via Gcc-patches
On Sun, Jul 10, 2022 at 8:36 PM Roger Sayle wrote: > > > Hi Uros, > Yes, I agree. I think it makes sense to have a single STV pass (after > combine and before reload). Let's hear what HJ thinks, but I'm > happy to investigate a follow-up patch that unifies the STV passes. > But it'll be easier t

Re: [PATCH] Allocate general register(memory/immediate) for 16/32/64-bit vector bit_op patterns.

2022-07-11 Thread Uros Bizjak via Gcc-patches
On Mon, Jul 11, 2022 at 3:15 AM liuhongt wrote: > > And split it to GPR-version instruction after reload. > > This will enable below optimization for 16/32/64-bit vector bit_op > > - movd (%rdi), %xmm0 > - movd (%rsi), %xmm1 > - pand %xmm1, %xmm0 > - movd %xmm0,
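
For readers unfamiliar with the kind of code this thread is about, here is a minimal sketch of a 32-bit vector bit operation using GCC's vector extensions. The type name v4qi and the helper and_v4qi_as_u32 are hypothetical names chosen for illustration and are not taken from the patch; the sketch only shows source code whose AND could, in principle, be carried out either in an SSE register or in a general-purpose register.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 4-byte vector type (GCC vector extension). */
typedef unsigned char v4qi __attribute__((vector_size(4)));

/* AND two 32-bit vectors that are passed around as plain integers.
   The element-wise AND on v4qi is the "vector bit_op" in question;
   the memcpy calls only move the 4 bytes in and out of the vector type. */
static uint32_t and_v4qi_as_u32(uint32_t a, uint32_t b)
{
    v4qi va, vb, vr;
    uint32_t r;

    memcpy(&va, &a, sizeof va);
    memcpy(&vb, &b, sizeof vb);
    vr = va & vb;               /* element-wise AND of four bytes */
    memcpy(&r, &vr, sizeof r);
    return r;
}
```

Because the operation is a plain bitwise AND, the result is identical to a & b on the integer values, which is why a general-purpose register can stand in for the vector register here.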

Re: [PATCH] Allocate general register(memory/immediate) for 16/32/64-bit vector bit_op patterns.

2022-07-12 Thread Uros Bizjak via Gcc-patches
On Tue, Jul 12, 2022 at 8:37 AM Hongtao Liu wrote: > > On Mon, Jul 11, 2022 at 4:03 PM Uros Bizjak via Gcc-patches > wrote: > > > > On Mon, Jul 11, 2022 at 3:15 AM liuhongt wrote: > > > > > > And split it to GPR-version instruction after reload. > >

Re: [PATCH] Extend 64-bit vector bit_op patterns with ?r alternative

2022-07-14 Thread Uros Bizjak via Gcc-patches
On Thu, Jul 14, 2022 at 7:33 AM liuhongt wrote: > > And split it to GPR-version instruction after reload. > > > ?r was introduced under the assumption that we want vector values > > mostly in vector registers. Currently there are no instructions with > > memory or immediate operand, so that made s

Re: [PATCH] Extend 64-bit vector bit_op patterns with ?r alternative

2022-07-14 Thread Uros Bizjak via Gcc-patches
On Thu, Jul 14, 2022 at 11:32 AM Hongtao Liu wrote: > > On Thu, Jul 14, 2022 at 3:22 PM Uros Bizjak via Gcc-patches > wrote: > > > > On Thu, Jul 14, 2022 at 7:33 AM liuhongt wrote: > > > > > > And split it to GPR-version instruction after reload. >

Re: [PATCH] PR target/106278: Keep REG_EQUAL notes consistent during TImode STV.

2022-07-14 Thread Uros Bizjak via Gcc-patches
On Thu, Jul 14, 2022 at 6:58 PM Roger Sayle wrote: > > > This patch resolves PR target/106278, a regression on x86_64 caused by my > recent TImode STV improvements. Now that TImode STV can handle comparisons > such as "(set (regs:CC) (compare:CC (reg:TI) ...))" the convert_insn method > sensibly c

Re: [x86 PATCH] PR target/106273: Add earlyclobber to *andn3_doubleword_bmi

2022-07-15 Thread Uros Bizjak via Gcc-patches
On Fri, Jul 15, 2022 at 3:28 PM Roger Sayle wrote: > > > > This patch resolves PR target/106273 which is a wrong code regression > > caused by the recent reorganization to split doubleword operations after > > reload on x86. For the failing test case, the constraints on the > > andnti3_doubleword

Re: [x86 PATCH] Fix issue with x86_64_const_vector_operand predicate.

2022-07-17 Thread Uros Bizjak via Gcc-patches
On Sat, Jul 16, 2022 at 2:06 PM Roger Sayle wrote: > > > This patch fixes (what I believe is) a latent bug in i386.md's > x86_64_const_vector_operand define_predicate. According to the > documentation, when a predicate is called with rtx operand OP and > machine_mode operand MODE, we can't should

Re: [x86_64 PATCH] PR target/106231: Optimize (any_extend:DI (ctz:SI ...)).

2022-07-17 Thread Uros Bizjak via Gcc-patches
On Sat, Jul 16, 2022 at 9:10 PM Roger Sayle wrote: > > > This patch resolves PR target/106231 by providing insns that recognize > (zero_extend:DI (ctz:SI ...)) and (sign_extend:DI (ctz:SI ...)). The > result of ctz:SI is always between 0 and 32 (or undefined), so > sign_extension is the same as z
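
The observation quoted above, that the result of a 32-bit ctz is small and non-negative so both kinds of extension agree, can be sketched in C as follows. The helper names ctz32, ctz_sext and ctz_zext are hypothetical and only serve the illustration; __builtin_ctz is the GCC builtin, and its result is undefined for a zero argument, so the sketch assumes nonzero input.

```c
#include <stdint.h>

/* Count trailing zeros of a nonzero 32-bit value.
   __builtin_ctz is undefined for 0, so callers must pass x != 0. */
static inline int ctz32(uint32_t x)
{
    return __builtin_ctz(x);
}

/* Widen the count to 64 bits through a signed conversion (sign extension). */
static inline uint64_t ctz_sext(uint32_t x)
{
    return (uint64_t)(int64_t)ctz32(x);
}

/* Widen the count to 64 bits through an unsigned conversion (zero extension). */
static inline uint64_t ctz_zext(uint32_t x)
{
    return (uint64_t)(uint32_t)ctz32(x);
}
```

Since the count never has its sign bit set, ctz_sext and ctz_zext return the same value for every nonzero input, which is the property the patch description relies on.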

Re: [PATCH] Extend 16/32-bit vector bit_op patterns with (m,0,i)(vertical) alternative.

2022-07-17 Thread Uros Bizjak via Gcc-patches
On Mon, Jul 18, 2022 at 3:59 AM liuhongt wrote: > > And split it after reload. > > >IMO, the only case it is worth adding is a direct immediate store to > >memory, which HJ recently added. > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > Ok for trunk? > > gcc/ChangeLog: > >

Re: [PATCH V2] Extend 16/32-bit vector bit_op patterns with (m, 0, i) alternative.

2022-07-18 Thread Uros Bizjak via Gcc-patches
On Tue, Jul 19, 2022 at 8:07 AM liuhongt wrote: > > And split it after reload. > > > You will need ix86_binary_operator_ok insn constraint here with > > corresponding expander using ix86_fixup_binary_operands_no_copy to > > prepare insn operands. > Split define_expand with just register_operand, a

Re: [PATCH V2] Extend 16/32-bit vector bit_op patterns with (m, 0, i) alternative.

2022-07-19 Thread Uros Bizjak via Gcc-patches
On Tue, Jul 19, 2022 at 8:56 AM Hongtao Liu wrote: > > On Tue, Jul 19, 2022 at 2:35 PM Uros Bizjak via Gcc-patches > wrote: > > > > On Tue, Jul 19, 2022 at 8:07 AM liuhongt wrote: > > > > > > And split it after reload. > > > > > > >

Re: [PATCH V2] Extend 16/32-bit vector bit_op patterns with (m, 0, i) alternative.

2022-07-19 Thread Uros Bizjak via Gcc-patches
On Wed, Jul 20, 2022 at 4:37 AM Hongtao Liu wrote: > > On Tue, Jul 19, 2022 at 5:37 PM Uros Bizjak wrote: > > > > On Tue, Jul 19, 2022 at 8:56 AM Hongtao Liu wrote: > > > > > > On Tue, Jul 19, 2022 at 2:35 PM Uros Bizjak via Gcc-patches > > > wrote:

Re: [PATCH V2] Extend 16/32-bit vector bit_op patterns with (m, 0, i) alternative.

2022-07-19 Thread Uros Bizjak via Gcc-patches
On Wed, Jul 20, 2022 at 8:14 AM Uros Bizjak wrote: > > On Wed, Jul 20, 2022 at 4:37 AM Hongtao Liu wrote: > > > > On Tue, Jul 19, 2022 at 5:37 PM Uros Bizjak wrote: > > > > > > On Tue, Jul 19, 2022 at 8:56 AM Hongtao Liu wrote: > > > > > >

Re: [PATCH V2] Extend 16/32-bit vector bit_op patterns with (m, 0, i) alternative.

2022-07-20 Thread Uros Bizjak via Gcc-patches
> > On Tue, Jul 19, 2022 at 5:37 PM Uros Bizjak wrote: > > > > > > > > > > On Tue, Jul 19, 2022 at 8:56 AM Hongtao Liu > > > > > wrote: > > > > > > > > > > > > On Tue, Jul 19, 2022 at 2:35

Re: [PATCH V3] Extend 16/32-bit vector bit_op patterns with (m, 0, i) alternative.

2022-07-20 Thread Uros Bizjak via Gcc-patches
On Thu, Jul 21, 2022 at 7:19 AM liuhongt wrote: > > And split it after reload. > > gcc/ChangeLog: > > PR target/106038 > * config/i386/mmx.md (3): New define_expand, it's > original "3". > (*3): New define_insn, it's original > "3" be extended to handle memo

Re: [x86 PATCH] PR target/106303: Fix TImode STV related failures.

2022-07-24 Thread Uros Bizjak via Gcc-patches
On Sat, Jul 23, 2022 at 9:32 AM Roger Sayle wrote: > > > This patch resolves PR target/106303 (and the related PRs 106347, > 106404, 106407) which are ICEs caused by my improvements to x86_64's > 128-bit TImode to V1TImode Scalar to Vector (STV) pass. My apologies > for the breakage. The issue i

Re: [x86 PATCH take #3] PR target/91681: zero_extendditi2 pattern for more optimizations.

2022-07-24 Thread Uros Bizjak via Gcc-patches
On Sat, Jul 23, 2022 at 10:51 AM Roger Sayle wrote: > > > > Hi Uros, > > This is the next iteration of the zero_extendditi2 patch last reviewed here: > > https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596204.html > > > > [1] The sse.md changes were split out, reviewed, approved and committed.

Re: [GCC 12] [PATCH] x86: Support 2/4/8 byte constant vector stores

2022-07-31 Thread Uros Bizjak via Gcc-patches
On Wed, Jul 27, 2022 at 4:24 PM H.J. Lu wrote: > > On Fri, Jul 1, 2022 at 8:31 AM Uros Bizjak wrote: > > > > On Thu, Jun 30, 2022 at 4:50 PM H.J. Lu wrote: > > > > > > 1. Add a predicate for constant vectors which can be converted to integer > > > constants suitable for constant integer stores.

Re: [x86 PATCH] Support logical shifts by (some) integer constants in TImode STV.

2022-07-31 Thread Uros Bizjak via Gcc-patches
On Fri, Jul 29, 2022 at 12:18 AM Roger Sayle wrote: > > > This patch improves TImode STV by adding support for logical shifts by > integer constants that are multiples of 8. For the test case: > > __int128 a, b; > void foo() { a = b << 16; } > > on x86_64, gcc -O2 currently generates: > >
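
One way to see why shift counts that are multiples of 8 are a natural special case: such a shift moves whole bytes, so it can be expressed as a byte-wise move. The following is a minimal sketch under that reading, not the patch's implementation. The names u128 and shl_bytes are hypothetical, and the code assumes a little-endian target such as x86_64 (enforced with a preprocessor check).

```c
#include <string.h>

#if __BYTE_ORDER__ != __ORDER_LITTLE_ENDIAN__
#error "this sketch assumes a little-endian target"
#endif

typedef unsigned __int128 u128;

/* Shift a 128-bit value left by a whole number of bytes by moving
   bytes in memory. On a little-endian target this equals v << (8 * bytes)
   for bytes < 16; counts of 16 or more yield 0 here. */
static u128 shl_bytes(u128 v, unsigned bytes)
{
    unsigned char src[16];
    unsigned char dst[16] = {0};
    u128 r;

    if (bytes >= 16)
        return 0;

    memcpy(src, &v, sizeof src);
    /* Low-order bytes move up; the vacated low bytes stay zero. */
    memcpy(dst + bytes, src, sizeof src - bytes);
    memcpy(&r, dst, sizeof r);
    return r;
}
```

With this helper, shl_bytes(b, 2) computes the same value as the b << 16 in the quoted test case.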

Re: [x86_64 PATCH take #2] PR target/106450: Tweak timode_remove_non_convertible_regs.

2022-07-31 Thread Uros Bizjak via Gcc-patches
On Sat, Jul 30, 2022 at 11:42 AM Roger Sayle wrote: > > > Many thanks to H.J. for pointing out a better idiom for traversing > the USEs (and also DEFs) of TImode registers in an instruction. > > This revised patch has been tested on x86_64-pc-linux-gnu with > make bootstrap and make -k check, bo
