gcc/ChangeLog:
PR target/110021
* config/i386/i386-expand.cc (ix86_expand_vecop_qihi2): Also require
TARGET_AVX512BW to generate truncv16hiv16qi2.
Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
Uros.
diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-
gcc/ChangeLog:
* rtl.h (rtx_addr_can_trap_p): Change return type from int to bool.
(rtx_unstable_p): Ditto.
(reg_mentioned_p): Ditto.
(reg_referenced_p): Ditto.
(reg_used_between_p): Ditto.
(reg_set_between_p): Ditto.
(modified_between_p): Ditto.
(no_labels_between_
On Mon, May 29, 2023 at 8:17 PM Roger Sayle wrote:
>
>
> This is my proposed minimal fix for PR target/109973 (hopefully suitable
> for backporting) that follows Jakub Jelinek's suggestion that we introduce
> CCZmode and CCCmode variants of ptest and vptest, so that the i386
> backend treats [v]pt
On Tue, May 30, 2023 at 9:39 AM Uros Bizjak wrote:
>
> On Mon, May 29, 2023 at 8:17 PM Roger Sayle
> wrote:
> >
> >
> > This is my proposed minimal fix for PR target/109973 (hopefully suitable
> > for backporting) that follows Jakub Jelinek's suggestion that we introduce
> > CCZmode and CCCmode
gcc/ChangeLog:
* rtl.h (comparison_dominates_p): Change return type from int to bool.
(condjump_p): Ditto.
(any_condjump_p): Ditto.
(any_uncondjump_p): Ditto.
(simplejump_p): Ditto.
(returnjump_p): Ditto.
(eh_returnjump_p): Ditto.
(onlyjump_p): Ditto.
(invert_ju
On Wed, May 31, 2023 at 9:17 AM Richard Biener
wrote:
>
> On Tue, May 30, 2023 at 9:01 PM Jeff Law via Gcc-patches
> wrote:
> >
> >
> >
> > On 5/30/23 08:36, Uros Bizjak via Gcc-patches wrote:
> > > gcc/ChangeLog:
> > >
> > > * rt
On Wed, May 31, 2023 at 9:40 AM Kewen.Lin wrote:
>
> Hi Andreas,
>
> on 2023/5/25 15:25, Andreas Krebbel wrote:
> > On 3/20/23 07:33, Kewen.Lin wrote:
> >> Hi,
> >>
> >> One of my workmates found there is a warning like:
> >>
> >> libgcc/config/rs6000/morestack.S:402: Warning: ignoring
> >>
Also remove a bunch of unneeded forward declarations.
gcc/ChangeLog:
* rtl.h (true_dependence): Change return type from int to bool.
(canon_true_dependence): Ditto.
(read_dependence): Ditto.
(anti_dependence): Ditto.
(canon_anti_dependence): Ditto.
(output_dependence): Dit
Also fix some stale comments.
gcc/ChangeLog:
* rtl.h (subreg_lowpart_p): Change return type from int to bool.
(active_insn_p): Ditto.
(in_sequence_p): Ditto.
(unshare_all_rtl): Change return type from int to void.
* emit-rtl.h (mem_expr_equal_p): Change return type from int
Also change some function arguments to bool and remove one instance
of always zero function argument.
gcc/ChangeLog:
* rtl.h (exp_equiv_p): Change return type from int to bool.
* cse.cc (mention_regs): Change return type from int to bool
and adjust function body accordingly.
(exp_
On Fri, Jun 2, 2023 at 2:49 AM liuhongt wrote:
>
> Add missing insn patterns for v2si -> v2hi/v2qi and v2hi-> v2qi vector
> truncate.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
>
> gcc/ChangeLog:
>
> PR target/92658
> * config/i386/mmx.md (truncv2
Also change some internal variables to bool and recode handling of
boolean variables to not use bitwise or.
gcc/ChangeLog:
* rtl.h (stack_regs_mentioned): Change return type from int to bool.
* reg-stack.cc (struct_block_info_def): Change "done" to bool.
(stack_regs_mentioned_p): Chan
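As a rough illustration of what recoding boolean handling "to not use bitwise or" can look like, here is a minimal sketch. The names process_block, any_changed_old and any_changed_new are hypothetical and are not taken from reg-stack.cc; the actual change in the patch may differ.

```cpp
#include <cassert>
#include <vector>

// Hypothetical per-block worker: reports whether it changed anything.
static bool process_block (int value)
{
  return value < 0;
}

// Before: an int flag accumulated with bitwise or.
static int any_changed_old (const std::vector<int> &blocks)
{
  int changed = 0;
  for (int b : blocks)
    changed |= process_block (b);
  return changed;
}

// After: a bool flag set explicitly.  The worker still runs for
// every block, so any side effects are preserved.
static bool any_changed_new (const std::vector<int> &blocks)
{
  bool changed = false;
  for (int b : blocks)
    if (process_block (b))
      changed = true;
  return changed;
}

// Usage: both variants agree on the answer.
void check_any_changed ()
{
  std::vector<int> blocks = { 1, -2, 3 };
  assert (any_changed_old (blocks) != 0);
  assert (any_changed_new (blocks));
}
```

Note that replacing "|=" with "||" directly would short-circuit and skip later calls, which is why the sketch keeps the call unconditional.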
On Sat, Jun 3, 2023 at 7:31 PM Roger Sayle wrote:
>
>
> This patch fixes PR target/110083, an ICE-on-valid regression exposed by
> my recent PTEST improvements (to address PR target/109973). The latent
> bug (admittedly mine) is that the scalar-to-vector (STV) pass doesn't update
> or delete REG_
On Sun, Jun 4, 2023 at 12:45 AM Roger Sayle wrote:
>
>
> This patch is the latest revision of my patch to add support for the
> STC (set carry flag), CLC (clear carry flag) and CMC (complement
> carry flag) instructions to the i386 backend, incorporating Uros'
> previous feedback. The significant
gcc/ChangeLog:
* rtl.h (reg_classes_intersect_p): Change return type from int to bool.
(reg_class_subset_p): Ditto.
* reginfo.cc (reg_classes_intersect_p): Ditto.
(reg_class_subset_p): Ditto.
Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
Uros
diff --git a/gcc/re
Also change one internal variable to bool.
gcc/ChangeLog:
* rtl.h (print_rtl_single): Change return type from int to void.
(print_rtl_single_with_indent): Ditto.
* print-rtl.h (class rtx_writer): Ditto. Change m_sawclose to bool.
* print-rtl.cc (rtx_writer::rtx_writer): Update fo
On Tue, Jun 6, 2023 at 6:33 AM liuhongt via Gcc-patches
wrote:
>
> r14-1145 folds the intrinsics into gimple ABS_EXPR, which has UB for
> TYPE_MIN, but PABSB will store an unsigned result into dst. The patch
> uses ABSU_EXPR + VCE instead of ABS_EXPR.
>
> Also don't fold _mm_abs_{pi8,pi16,pi32} w/o TAR
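To make the TYPE_MIN issue in the quoted text concrete, here is a small scalar sketch for one 8-bit lane. It assumes that "VCE" stands for VIEW_CONVERT_EXPR; the helper names absu8 and view_as_signed are hypothetical and only model the idea, they are not GCC internals.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Absolute value computed in the unsigned type, in the spirit of
// ABSU_EXPR: well defined for every input, including INT8_MIN.
static uint8_t absu8 (int8_t x)
{
  uint8_t ux = static_cast<uint8_t> (x);
  return x < 0 ? static_cast<uint8_t> (0u - ux) : ux;
}

// Reinterpret the unsigned bits as a signed lane, playing the role
// of the view conversion back to the intrinsic's signed type.
static int8_t view_as_signed (uint8_t u)
{
  int8_t s;
  std::memcpy (&s, &u, sizeof s);
  return s;
}

// Usage: the INT8_MIN lane keeps its bit pattern (0x80), which reads
// as 128 unsigned and as INT8_MIN when viewed as signed.
void check_absu8 ()
{
  assert (absu8 (-5) == 5);
  assert (absu8 (INT8_MIN) == 128);
  assert (view_as_signed (absu8 (INT8_MIN)) == INT8_MIN);
}
```

A signed absolute value has no representable result for INT8_MIN, which is the undefined case the quoted message refers to; computing in the unsigned type sidesteps it.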
On Tue, Jun 6, 2023 at 1:42 PM Hongtao Liu wrote:
>
> On Tue, Jun 6, 2023 at 5:11 PM Uros Bizjak wrote:
> >
> > On Tue, Jun 6, 2023 at 6:33 AM liuhongt via Gcc-patches
> > wrote:
> > >
> > > > r14-1145 folds the intrinsics into gimple ABS_EXPR, which has UB for
> > > TYPE_MIN, but PABSB will store u
gcc/ChangeLog:
* rtl.h (function_invariant_p): Change return type from int to bool.
* reload1.cc (function_invariant_p): Change return type from
int to bool and adjust function body accordingly.
Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
Uros.
diff --git a/gcc/re
On Tue, Jun 6, 2023 at 5:14 PM Roger Sayle wrote:
>
>
> Hi Uros,
> This revision implements your suggestions/refinements. (i) Avoid the
> UNSPEC_CMC by using the canonical RTL idiom for *x86_cmc, (ii) Use
> peephole2s to convert x86_stc and *x86_cmc into alternate forms on
> TARGET_SLOW_STC CPUs (
On Tue, Jun 6, 2023 at 11:00 PM Roger Sayle wrote:
>
>
> Hi Uros,
> Might you be willing to approve the patch without the *x86_clc pieces?
> These can be submitted later, when they are actually used. For now,
> we're arguing about the performance of a pattern that's not yet
> generated on an obsolet
On Wed, Jun 7, 2023 at 1:05 AM Roger Sayle wrote:
>
>
> This patch addresses the last remaining issue with PR target/31985, that
> GCC could make better use of memory addressing modes when implementing
> double word addition. This is achieved by adding a define_insn_and_split
> that combines an *
On Wed, Jun 7, 2023 at 8:32 AM Uros Bizjak wrote:
>
> On Wed, Jun 7, 2023 at 1:05 AM Roger Sayle wrote:
> >
> >
> > This patch addresses the last remaining issue with PR target/31985, that
> > GCC could make better use of memory addressing modes when implementing
> > double word addition. This i
On Mon, Jun 12, 2023 at 4:03 PM Roger Sayle wrote:
>
>
> The following simple test case, from PR 104610, shows that memcmp () == 0
> can result in some bizarre code sequences on x86.
>
> int foo(char *a)
> {
> static const char t[] = "0123456789012345678901234567890";
> return __builtin_me
On Tue, Jun 13, 2023 at 9:06 AM Jakub Jelinek wrote:
>
> Hi!
>
> On Tue, Jun 06, 2023 at 11:42:07PM +0200, Jakub Jelinek via Gcc-patches wrote:
> > The following patch introduces {add,sub}c5_optab and pattern recognizes
> > various forms of add with carry and subtract with carry/borrow, see
> > pr
On Tue, Jun 13, 2023 at 6:03 PM Roger Sayle wrote:
>
>
> This patch is the next instalment in a set of backend patches around
> improvements to ptest/vptest. A previous patch optimized the sequence
> t=pand(x,y); ptestz(t,t) into the equivalent ptestz(x,y), using the
> property that ZF is set to
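The equivalence mentioned in the quoted text can be checked with a small scalar model. This is only a sketch: v128, pand and ptestz below are hypothetical stand-ins that model a 128-bit register as two 64-bit halves and model only the ZF result of PTEST (set when the AND of both operands is all zeros). They are not the SSE intrinsics.

```cpp
#include <cassert>
#include <cstdint>

// A 128-bit value modelled as two 64-bit halves.
struct v128 { uint64_t lo; uint64_t hi; };

// Model of PAND: bitwise AND of both halves.
static v128 pand (v128 x, v128 y)
{
  return v128 { x.lo & y.lo, x.hi & y.hi };
}

// Model of the ZF result of PTEST: true when (a AND b) is all zeros.
static bool ptestz (v128 a, v128 b)
{
  return ((a.lo & b.lo) | (a.hi & b.hi)) == 0;
}

// Usage: testing t against itself, with t = pand (x, y), gives the
// same answer as testing x against y directly.
void check_ptestz ()
{
  v128 x = { 0xf0f0u, 0x1u };
  v128 y = { 0x0f0fu, 0x2u };   // no bits in common with x
  v128 z = { 0x0010u, 0x0u };   // shares one bit with x
  assert (ptestz (pand (x, y), pand (x, y)) == ptestz (x, y));
  assert (ptestz (pand (x, z), pand (x, z)) == ptestz (x, z));
  assert (ptestz (x, y));
  assert (!ptestz (x, z));
}
```

Since t AND t is just t, the separate pand is redundant whenever its result is only consumed by the test.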
Use default argument when callback function is not required to merge
rtx_equal_p and hash_rtx functions with their callback variants.
gcc/ChangeLog:
* cse.cc (hash_rtx_cb): Rename to hash_rtx.
(hash_rtx): Remove.
* early-remat.cc (remat_candidate_hasher::equal): Update
to call rtx
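The general technique of folding a callback variant into the plain function via a default argument might look like the following sketch. The names values_equal_p, equal_callback and mod10_equal are hypothetical; they only mirror the shape of the change described above and are not the real rtx_equal_p or hash_rtx signatures.

```cpp
#include <cassert>

// Hypothetical callback type: returns true when it has decided the
// comparison itself and stored the verdict in *result.
typedef bool (*equal_callback) (int a, int b, bool *result);

// A single function replaces the former plain/callback pair.
// Callers that need no callback just omit the last argument.
static bool values_equal_p (int a, int b, equal_callback cb = nullptr)
{
  bool result = false;
  if (cb != nullptr && cb (a, b, &result))
    return result;
  return a == b;
}

// Example callback: treat values as equal when they match modulo 10.
static bool mod10_equal (int a, int b, bool *result)
{
  *result = (a % 10) == (b % 10);
  return true;
}

// Usage: with and without the optional callback.
void check_values_equal ()
{
  assert (values_equal_p (3, 3));
  assert (!values_equal_p (3, 13));
  assert (values_equal_p (3, 13, mod10_equal));
}
```

Existing callers of the plain function keep compiling unchanged, and callers of the old callback variant only need the function name updated.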
On Wed, Jun 14, 2023 at 4:00 PM Jakub Jelinek wrote:
>
> Hi!
>
> On Wed, Jun 14, 2023 at 12:35:42PM +, Richard Biener wrote:
> > At this point two pages of code without a comment - can you introduce
> > some vertical spacing and comments as to what is matched now? The
> > split out functions
On Wed, Jun 14, 2023 at 4:56 PM Jakub Jelinek wrote:
>
> On Wed, Jun 14, 2023 at 04:34:27PM +0200, Uros Bizjak wrote:
> > LGTM for the x86 part. I did my best, but those peephole2 patterns are
> > real PITA to be reviewed thoroughly.
> >
> > Maybe split out peephole2 pack to a separate patch, foll
On Thu, Jun 15, 2023 at 8:03 AM Jan Beulich via Gcc-patches
wrote:
>
> The input constraint for the %vmovddup alternative was wrong, as the
> upper 16 XMM registers require AVX512VL to be used with this insn. To
> compensate, introduce a new alternative permitting all 32 registers, by
> broadcasti
On Thu, Jun 15, 2023 at 10:15 AM Jan Beulich wrote:
>
> On 15.06.2023 09:45, Hongtao Liu wrote:
> > On Thu, Jun 15, 2023 at 3:07 PM Uros Bizjak via Gcc-patches
> > wrote:
> >> On Thu, Jun 15, 2023 at 8:03 AM Jan Beulich via Gcc-patches
> >> wrote:
>
On Fri, Jun 16, 2023 at 12:04 AM Roger Sayle wrote:
>
>
> Hi Uros,
>
> > On the 7th June 2023, Uros Bizjak wrote:
> > The register allocator considers the instruction-to-be-split as one
> > instruction, so it
> > can allocate output register to match an input register (or a register that
> > for
On Tue, Jul 11, 2023 at 9:07 PM Roger Sayle wrote:
>
>
> This patch fixes the regression PR target/110598 caused by my recent
> addition of a peephole2. The intention of that optimization was to
> simplify zeroing a register, followed by an IOR, XOR or PLUS operation
> on it into a move, or as de
On Tue, Jul 11, 2023 at 10:07 PM Roger Sayle wrote:
>
>
> The recent change in TImode parameter passing on x86_64 results in the
> FAIL of pr91681-1.c. The issue is that with the extra flexibility,
> the combine pass is now spoilt for choice between using either the
> *add3_doubleword_concat or t
>> > On Mon, Jul 10, 2023 at 11:26 AM Uros Bizjak wrote:
> >> > >
> >> > > On Mon, Jul 10, 2023 at 11:17 AM Richard Biener
> >> > > wrote:
> >> > > >
> >> > > > On Sun, Jul 9, 2023 at 10:53 AM Uros Bizjak vi
>> On Mon, Jul 10, 2023 at 11:47 AM Richard Biener
> > >> wrote:
> > >> >
> > >> > On Mon, Jul 10, 2023 at 11:26 AM Uros Bizjak wrote:
> > >> > >
> > >> > > On Mon, Jul 10, 2023 at 11:17 AM Richard
Also change some internal variables and function arguments from int to bool.
gcc/ChangeLog:
* ifcvt.cc (cond_exec_changed_p): Change variable to bool.
(last_active_insn): Change "skip_use_p" function argument to bool.
(noce_operand_ok): Change return type from int to bool.
(find_c
gcc/ChangeLog:
* ira.cc (equiv_init_varies_p): Change return type from int to bool
and adjust function body accordingly.
(equiv_init_movable_p): Ditto.
(memref_used_between_p): Ditto.
* lra-constraints.cc (valid_address_p): Ditto.
Bootstrapped and regression tested on x86_64-l
PR target/106966
gcc/ChangeLog:
* config/alpha/alpha.cc (alpha_emit_set_long_const):
Always use DImode when constructing long const.
gcc/testsuite/ChangeLog:
* gcc.target/alpha/pr106966.c: New test.
Bootstrapped and regression tested by Matthias on alpha-linux-gnu.
Uros.
diff
On Thu, Jul 13, 2023 at 7:10 PM Roger Sayle wrote:
>
>
> This patch resolves PR target/110588 to catch another case in combine
> where the i386 backend should be generating a btl instruction. This adds
> another define_insn_and_split to recognize the RTL representation for this
> case.
>
> I also
cprop1 pass does not consider paradoxical subreg and for (insn 22) claims
that it equals 8 elements of HImode by setting REG_EQUAL note:
(insn 21 19 22 4 (set (reg:V4QI 98)
(mem/u/c:V4QI (symbol_ref/u:DI ("*.LC1") [flags 0x2]) [0 S4
A32])) "pr110206.c":12:42 1530 {*movv4qi_internal}
(
On Fri, Jul 14, 2023 at 8:24 AM Haochen Jiang wrote:
>
> Hi all,
>
> This patch aims to auto vectorize usdot_prod and udot_prod with newly
> introduced AVX-VNNI-INT16.
>
> Also I refined the redundant mode iterator in the patch.
>
> Regtested on x86_64-pc-linux-gnu. Ok for trunk after AVX-VNNI-INT
On Thu, Jul 13, 2023 at 6:45 PM Roger Sayle wrote:
>
>
> This is the next piece towards a fix for (the x86_64 ABI issues affecting)
> PR 88873. This patch generalizes the recent tweak to ix86_expand_move
> for setting the highpart of a TImode reg from a DImode source using
> *insvti_highpart_1, t
On Fri, Jul 14, 2023 at 10:31 AM Richard Biener wrote:
>
> On Fri, 14 Jul 2023, Uros Bizjak wrote:
>
> > cprop1 pass does not consider paradoxical subreg and for (insn 22) claims
> > that it equals 8 elements of HImode by setting REG_EQUAL note:
> >
> > (insn 21 19 22 4 (set (reg:V4QI 98)
> >
On Fri, Jul 14, 2023 at 10:53 AM Richard Biener wrote:
>
> On Fri, 14 Jul 2023, Uros Bizjak wrote:
>
> > On Fri, Jul 14, 2023 at 10:31 AM Richard Biener wrote:
> > >
> > > On Fri, 14 Jul 2023, Uros Bizjak wrote:
> > >
> > > > cprop1 pass does not consider paradoxical subreg and for (insn 22)
> >
On Fri, Jul 14, 2023 at 11:27 AM Roger Sayle wrote:
>
>
> > From: Uros Bizjak
> > Sent: 13 July 2023 19:21
> >
> > On Thu, Jul 13, 2023 at 7:10 PM Roger Sayle
> > wrote:
> > >
> > > This patch resolves PR target/110588 to catch another case in combine
> > > where the i386 backend should be gener
On Fri, Jul 14, 2023 at 11:44 AM Jan Beulich wrote:
>
> The corresponding insn serves this purpose quite fine, and leads to
> slightly less (generated) code. All we need is the insn to not have a
> leading * in its name, while retaining that * for "extendhfsf2".
> Introduce a mode attribute in exc
On Mon, Jul 17, 2023 at 8:44 AM Hongtao Liu wrote:
>
> Ping.
>
> On Tue, Jul 11, 2023 at 5:16 PM liuhongt via Gcc-patches
> wrote:
> >
> > Similar to what we did for CMPXCHG, but extended to all
> > ix86_comparison_int_operator, since CMPCCXADD sets EFLAGS exactly the
> > same as CMP.
> >
> > When operand
On Mon, Jul 17, 2023 at 10:28 AM Hongtao Liu wrote:
>
> I'd like to ping for this patch (only patch 1/2, for patch 2/2, I
> think that may not be necessary).
>
> On Mon, May 15, 2023 at 9:20 AM Hongtao Liu wrote:
> >
> > ping.
> >
> > On Fri, Apr 21, 2023 at 9:55 PM liuhongt wrote:
> > >
> > > >
Also change some internal variables and function arguments from int to bool.
gcc/ChangeLog:
* combine.cc (struct reg_stat_type): Change last_set_invalid to bool.
(cant_combine_insn_p): Change return type from int to bool and adjust
function body accordingly.
(can_combine_p): Ditto
Also change some internal variables and function arguments from int to bool.
gcc/ChangeLog:
* dwarf2asm.cc: Change FALSE to false.
* dwarf2cfi.cc (execute_dwarf2_frame): Change return type to void.
* dwarf2out.cc (matches_main_base): Change return type from
int to bool. Change "l
On Wed, Jul 19, 2023 at 2:21 PM Richard Biener
wrote:
>
> On Sun, Jun 11, 2023 at 12:55 AM Roger Sayle
> wrote:
> >
> >
> > This is a backport of the fixes for PR target/109973 and PR target/110083.
> >
> > This backport to the releases/gcc-13 branch has been tested on
> > x86_64-pc-linux-gnu wi
On Wed, Jul 19, 2023 at 10:07 PM Roger Sayle wrote:
>
>
> This patch is the next piece of a solution to the x86_64 ABI issues in
> PR 88873. This splits the *concat3_3 define_insn_and_split
> into two patterns, a TARGET_64BIT *concatditi3_3 and a !TARGET_64BIT
> *concatsidi3_3. This allows us to
On Thu, Jul 20, 2023 at 9:44 AM Roger Sayle wrote:
>
>
> Hi Uros,
>
> > From: Uros Bizjak
> > Sent: 20 July 2023 07:50
> >
> > On Wed, Jul 19, 2023 at 10:07 PM Roger Sayle
> > wrote:
> > >
> > > This patch is the next piece of a solution to the x86_64 ABI issues in
> > > PR 88873. This splits t
On Thu, Jul 20, 2023 at 9:35 AM liuhongt wrote:
>
> For Intel processors, after TARGET_AVX, vmovdqu is optimized as fast
> as vlddqu, UNSPEC_LDDQU can be removed to enable more optimizations.
> Can someone confirm this with AMD folks?
> If AMD doesn't like such optimization, I'll put my optimizati
When sign-extending the value in a double-word register pair using shift and
ashiftrt sequence with the same count immediate value less than word width,
there is no need to shift the lower word of the value. The sign-extension
could be limited to the upper word, but we uselessly shift the lower wor
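The claim that the lower word need not be shifted can be checked with a small sketch that models a double-word value as a 64-bit integer made of two 32-bit words. The function names are hypothetical, and the code assumes arithmetic right shift of negative values and modular conversion to signed types, which GCC provides and C++20 guarantees.

```cpp
#include <cassert>
#include <cstdint>

// Reference: shift left, then arithmetic shift right, by the same
// count n on the whole double-word value.
static int64_t sext_full (int64_t x, unsigned n)
{
  assert (n > 0 && n < 32);
  return static_cast<int64_t> (static_cast<uint64_t> (x) << n) >> n;
}

// Same result computed on a (hi, lo) pair of 32-bit words: only the
// upper word is shifted, the lower word is passed through untouched.
static int64_t sext_upper_only (int64_t x, unsigned n)
{
  assert (n > 0 && n < 32);
  uint32_t lo = static_cast<uint32_t> (x);
  uint32_t hi = static_cast<uint32_t> (static_cast<uint64_t> (x) >> 32);
  int32_t new_hi = static_cast<int32_t> (hi << n) >> n;
  uint64_t bits
    = (static_cast<uint64_t> (static_cast<uint32_t> (new_hi)) << 32) | lo;
  return static_cast<int64_t> (bits);
}

// Usage: both forms agree for counts below the word width.
void check_sext ()
{
  int64_t x = 0x0000008012345678LL;
  assert (sext_upper_only (x, 24) == sext_full (x, 24));
  assert (sext_upper_only (x, 24)
          == static_cast<int64_t> (0xffffff8012345678ULL));
}
```

Because the count is less than the word width, every bit shifted out of the lower word is shifted straight back in, so the lower word is unchanged.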
On Sat, Jul 22, 2023 at 5:37 PM Roger Sayle wrote:
>
>
> As suggested by Uros, this patch changes the ZERO_EXTRACTs and SIGN_EXTRACTs
> in i386.md to consistently use QImode for bit offsets (i.e. third and fourth
> operands), matching the use of QImode for bit counts in shifts and rotates.
>
> The
On Sat, Jul 22, 2023 at 4:17 PM Roger Sayle wrote:
>
>
> This patch attempts to help with PR rtl-optimization/110587, a regression
> of -O0 compile time for the pathological pr28071.c. My recent patch helps
> a bit, but hasn't returned -O0 compile-time to where it was before my
> ix86_expand_move
Clear the upper half of a V4SFmode operand register in front of all
potentially trapping instructions. The testcase:
--cut here--
typedef float v2sf __attribute__((vector_size(8)));
typedef float v4sf __attribute__((vector_size(16)));
v2sf test(v4sf x, v4sf y)
{
v2sf x2, y2;
x2 = __builtin_s
The testcase should use dg-additional-options instead of dg-options to
not overwrite default compile flags that include path for finding
the IEEE modules.
gcc/testsuite/ChangeLog:
* gfortran.dg/ieee/comparisons_3.F90: Use dg-additional-options
instead of dg-options.
Tested on x86_64-linu
Also introduce -m[no-]mmxfp-with-sse option to disable trapping V2SF
named patterns in order to avoid generation of partial vector V4SFmode
trapping instructions.
The new option is enabled by default, because even with sanitization,
a small but consistent speed up of 2 to 3% with Polyhedron capaci
On Mon, Jul 31, 2023 at 11:40 AM Richard Biener wrote:
>
> On Sun, 30 Jul 2023, Uros Bizjak wrote:
>
> > Also introduce -m[no-]mmxfp-with-sse option to disable trapping V2SF
> > named patterns in order to avoid generation of partial vector V4SFmode
> > trapping instructions.
> >
> > The new option
On Wed, Aug 2, 2023 at 3:33 AM liuhongt wrote:
>
> In [1], I propose a patch to generate vmovdqu for all vlddqu intrinsics
> after AVX2, it's rejected as
> > The instruction is reachable only as __builtin_ia32_lddqu* (aka
> > _mm_lddqu_si*), so it was chosen by the programmer for a reason. I
> > t
On Fri, Jul 1, 2022 at 1:00 AM Roger Sayle wrote:
>
>
> When optimizing for size with -Oz, setting a register can be minimized by
> pushing an immediate value to the stack and popping it to the destination.
> Alas the one general register that shouldn't be updated via the stack is
> the stack poin
On Fri, Jul 1, 2022 at 12:17 PM Roger Sayle wrote:
>
>
> Many thanks to Uros for spotting that I'd forgotten to add constraints
> to the new define_insn_and_split *andn_doubleword_bmi when moving it
> from pre-reload to post-reload. I've pushed this obvious fix after a
> make bootstrap on x86_64-
ANDN is non-destructive, so use "r" instead of "0" for its operand 1 constraint.
2022-07-01 Uroš Bizjak
gcc/ChangeLog:
* config/i386/i386.md (*andn3_doubleword_bmi):
Use "r" constraint for operand 1.
Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
Pushed to master.
On Thu, Jun 30, 2022 at 4:50 PM H.J. Lu wrote:
>
> 1. Add a predicate for constant vectors which can be converted to integer
> constants suitable for constant integer stores. For a 8-byte constant
> vector, the converted 64-bit integer must be valid for store with 64-bit
> immediate, which is a 6
On Mon, Jul 4, 2022 at 7:10 AM Jiang, Haochen wrote:
>
> Hi all,
>
> I revised my patch according to all your reviews.
>
> Regtested on x86_64-pc-linux-gnu.
OK.
Thanks,
Uros.
>
> BRs,
> Haochen
>
> > -Original Message-
> > From: Liu, Hongtao
> > Sent: Thursday, June 30, 2022 4:57 PM
>
On Mon, Jul 4, 2022 at 7:27 PM Roger Sayle wrote:
>
>
> Hi Uros,
> Thanks for the review. This patch implements all of your suggestions, both
> removing ix86_pre_reload_split from the combine splitter(s), and dividing
> the original splitter up into four simpler variants, that use match_dup to
>
On Mon, Jul 4, 2022 at 9:11 PM Roger Sayle wrote:
>
>
> This patch is the latest revision of the patch originally posted at:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596201.html
>
> This patch extends the earlier and;cmp to not;test optimization to also
> perform this transformation f
On Tue, Jul 5, 2022 at 9:56 AM Uros Bizjak wrote:
>
> On Mon, Jul 4, 2022 at 9:11 PM Roger Sayle wrote:
> >
> >
> > This patch is the latest revision of the patch originally posted at:
> > https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596201.html
> >
> > This patch extends the earlier and;c
On Thu, Jul 7, 2022 at 7:52 AM Haochen Jiang wrote:
>
> Hi all,
>
> This patch aims to fix the ICE for vec unpack using for memory after the commit
> r13-1418 on improper insn of cvtps2pd.
>
> Regtested on x86_64-pc-linux-gnu. Ok for trunk?
>
> BRs,
> Haochen
>
> gcc/ChangeLog:
>
> PR targe
On Thu, Jul 7, 2022 at 6:41 PM Roger Sayle wrote:
>
>
> This patch fixes the current two FAILs of pr65105-5.c on x86 when
> compiled with -m32. These (temporary) breakages were fallout from my
> patches to improve/upgrade (scalar) double word comparisons.
> On mainline, the i386 backend currently
On Fri, Jul 8, 2022 at 9:15 AM Roger Sayle wrote:
>
>
> This patch adds support for x86's single-byte encoded stc (set carry flag)
> and clc (clear carry flag) instructions to i386.md.
>
> The motivating example is the simple code snippet:
>
> unsigned int foo (unsigned int a, unsigned int b, unsi
On Sat, Jul 9, 2022 at 11:26 AM Roger Sayle wrote:
>
>
> This is a backport of the fix for PR target/105930 from mainline to the
> gcc12 release branch. This patch has been retested against the gcc12
> branch on x86_64-pc-linux-gnu with make bootstrap and make -k check,
> both with and without --
On Sat, Jul 9, 2022 at 2:17 PM Roger Sayle wrote:
>
>
> This patch upgrades x86_64's scalar-to-vector (STV) pass to more
> aggressively transform 128-bit scalar TImode operations into vector
> V1TImode operations performed on SSE registers. TImode functionality
> already exists in STV, but only f
On Sun, Jul 10, 2022 at 8:36 PM Roger Sayle wrote:
>
>
> Hi Uros,
> Yes, I agree. I think it makes sense to have a single STV pass (after
> combine and before reload). Let's hear what HJ thinks, but I'm
> happy to investigate a follow-up patch that unifies the STV passes.
> But it'll be easier t
On Mon, Jul 11, 2022 at 3:15 AM liuhongt wrote:
>
> And split it to GPR-version instruction after reload.
>
> This will enable below optimization for 16/32/64-bit vector bit_op
>
> - movd (%rdi), %xmm0
> - movd (%rsi), %xmm1
> - pand %xmm1, %xmm0
> - movd %xmm0,
On Tue, Jul 12, 2022 at 8:37 AM Hongtao Liu wrote:
>
> On Mon, Jul 11, 2022 at 4:03 PM Uros Bizjak via Gcc-patches
> wrote:
> >
> > On Mon, Jul 11, 2022 at 3:15 AM liuhongt wrote:
> > >
> > > And split it to GPR-version instruction after reload.
> >
On Thu, Jul 14, 2022 at 7:33 AM liuhongt wrote:
>
> And split it to GPR-version instruction after reload.
>
> > ?r was introduced under the assumption that we want vector values
> > mostly in vector registers. Currently there are no instructions with
> > memory or immediate operand, so that made s
On Thu, Jul 14, 2022 at 11:32 AM Hongtao Liu wrote:
>
> On Thu, Jul 14, 2022 at 3:22 PM Uros Bizjak via Gcc-patches
> wrote:
> >
> > On Thu, Jul 14, 2022 at 7:33 AM liuhongt wrote:
> > >
> > > And split it to GPR-version instruction after reload.
>
On Thu, Jul 14, 2022 at 6:58 PM Roger Sayle wrote:
>
>
> This patch resolves PR target/106278 a regression on x86_64 caused by my
> recent TImode STV improvements. Now that TImode STV can handle comparisons
> such as "(set (regs:CC) (compare:CC (reg:TI) ...))" the convert_insn method
> sensibly c
On Fri, Jul 15, 2022 at 3:28 PM Roger Sayle wrote:
>
>
>
> This patch resolves PR target/106273 which is a wrong code regression
>
> caused by the recent reorganization to split doubleword operations after
>
> reload on x86. For the failing test case, the constraints on the
>
> andnti3_doubleword
On Sat, Jul 16, 2022 at 2:06 PM Roger Sayle wrote:
>
>
> This patch fixes (what I believe is) a latent bug in i386.md's
> x86_64_const_vector_operand define_predicate. According to the
> documentation, when a predicate is called with rtx operand OP and
> machine_mode operand MODE, we can't should
On Sat, Jul 16, 2022 at 9:10 PM Roger Sayle wrote:
>
>
> This patch resolves PR target/106231 by providing insns that recognize
> (zero_extend:DI (ctz:SI ...)) and (sign_extend:DI (ctz:SI ...)). The
> result of ctz:SI is always between 0 and 32 (or undefined), so
> sign_extension is the same as z
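The reasoning in the quoted text is that a value known to be small and non-negative extends the same way under both extensions. A minimal sketch, using a portable loop in place of the ctz:SI operation; ctz32, zext and sext are hypothetical helper names, and the zero input (the "undefined" case above) is excluded by an assertion.

```cpp
#include <cassert>
#include <cstdint>

// Portable count-trailing-zeros for a nonzero 32-bit value.
static int32_t ctz32 (uint32_t x)
{
  assert (x != 0);
  int32_t n = 0;
  while ((x & 1u) == 0)
    {
      x >>= 1;
      ++n;
    }
  return n;
}

// Zero-extend and sign-extend a 32-bit value to 64 bits.
static uint64_t zext (int32_t v) { return static_cast<uint32_t> (v); }
static int64_t sext (int32_t v) { return v; }

// Usage: the count is never negative, so both extensions produce
// the same 64-bit bit pattern.
void check_ctz_extend ()
{
  assert (ctz32 (0x80000000u) == 31);
  assert (zext (ctz32 (0x80000000u))
          == static_cast<uint64_t> (sext (ctz32 (0x80000000u))));
}
```

The two extensions differ only when the sign bit of the 32-bit value is set, which cannot happen for a count in the stated range.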
On Mon, Jul 18, 2022 at 3:59 AM liuhongt wrote:
>
> And split it after reload.
>
> >IMO, the only case it is worth adding is a direct immediate store to
> >memory, which HJ recently added.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
>
> gcc/ChangeLog:
>
>
On Tue, Jul 19, 2022 at 8:07 AM liuhongt wrote:
>
> And split it after reload.
>
> > You will need ix86_binary_operator_ok insn constraint here with
> > corresponding expander using ix86_fixup_binary_operands_no_copy to
> > prepare insn operands.
> Split define_expand with just register_operand, a
On Tue, Jul 19, 2022 at 8:56 AM Hongtao Liu wrote:
>
> On Tue, Jul 19, 2022 at 2:35 PM Uros Bizjak via Gcc-patches
> wrote:
> >
> > On Tue, Jul 19, 2022 at 8:07 AM liuhongt wrote:
> > >
> > > And split it after reload.
> > >
> > > >
On Wed, Jul 20, 2022 at 4:37 AM Hongtao Liu wrote:
>
> On Tue, Jul 19, 2022 at 5:37 PM Uros Bizjak wrote:
> >
> > On Tue, Jul 19, 2022 at 8:56 AM Hongtao Liu wrote:
> > >
> > > On Tue, Jul 19, 2022 at 2:35 PM Uros Bizjak via Gcc-patches
> > > wrote:
On Wed, Jul 20, 2022 at 8:14 AM Uros Bizjak wrote:
>
> On Wed, Jul 20, 2022 at 4:37 AM Hongtao Liu wrote:
> >
> > On Tue, Jul 19, 2022 at 5:37 PM Uros Bizjak wrote:
> > >
> > > On Tue, Jul 19, 2022 at 8:56 AM Hongtao Liu wrote:
> > > >
> &g
> > On Tue, Jul 19, 2022 at 5:37 PM Uros Bizjak wrote:
> > > > >
> > > > > On Tue, Jul 19, 2022 at 8:56 AM Hongtao Liu
> > > > > wrote:
> > > > > >
> > > > > > On Tue, Jul 19, 2022 at 2:35
On Thu, Jul 21, 2022 at 7:19 AM liuhongt wrote:
>
> And split it after reload.
>
> gcc/ChangeLog:
>
> PR target/106038
> * config/i386/mmx.md (3): New define_expand, it's
> original "3".
> (*3): New define_insn, it's original
> "3" be extended to handle memo
On Sat, Jul 23, 2022 at 9:32 AM Roger Sayle wrote:
>
>
> This patch resolves PR target/106303 (and the related PRs 106347,
> 106404, 106407) which are ICEs caused by my improvements to x86_64's
> 128-bit TImode to V1TImode Scalar to Vector (STV) pass. My apologies
> for the breakage. The issue i
On Sat, Jul 23, 2022 at 10:51 AM Roger Sayle wrote:
>
>
>
> Hi Uros,
>
> This is the next iteration of the zero_extendditi2 patch last reviewed here:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596204.html
>
>
>
> [1] The sse.md changes were split out, reviewed, approved and committed.
On Wed, Jul 27, 2022 at 4:24 PM H.J. Lu wrote:
>
> On Fri, Jul 1, 2022 at 8:31 AM Uros Bizjak wrote:
> >
> > On Thu, Jun 30, 2022 at 4:50 PM H.J. Lu wrote:
> > >
> > > 1. Add a predicate for constant vectors which can be converted to integer
> > > constants suitable for constant integer stores.
On Fri, Jul 29, 2022 at 12:18 AM Roger Sayle wrote:
>
>
> This patch improves TImode STV by adding support for logical shifts by
> integer constants that are multiples of 8. For the test case:
>
> __int128 a, b;
> void foo() { a = b << 16; }
>
> on x86_64, gcc -O2 currently generates:
>
>
On Sat, Jul 30, 2022 at 11:42 AM Roger Sayle wrote:
>
>
> Many thanks to H.J. for pointing out a better idiom for traversing
> the USEs (and also DEFs) of TImode registers in an instruction.
>
> This revised patch has been tested on x86_64-pc-linux-gnu with
> make bootstrap and make -k check, bo