From: Aaron Sawdey
In a previous fusion-combine patch for rs6000, Segher had asked me to
comment out the automatic regeneration of fusion.md. And more recently
Edelsohn pointed out that gcc_update needed to fix the timestamp of
fusion.md so it didn't get unnecessarily regenerated.
OK for trunk i
From: Aaron Sawdey
This patch implements an RTL pass that looks for pc-relative loads of the
address of an external variable using the PCREL_GOT relocation and a
single load or store that uses that external address.
Produced by a cast of thousands:
* Michael Meissner
* Peter Bergner
* Bill Sch
From: Aaron Sawdey
PR99070 is caused by a fusion pattern matching something that the individual
instructions do not match when it is split later. In this case the
ld+cmpi patterns were allowing a d-form load address, which the split
condition would rightly split, however that left us with something that
co
From: Aaron Sawdey
Ping, as it has been a while.
This also includes a slight fix to make sure that all references can get
optimized.
This patch implements an RTL pass that looks for pc-relative loads of the
address of an external variable using the PCREL_GOT relocation and a
single load or store
From: Aaron Sawdey
After discussion with Richard Sandiford on IRC, he suggested adding a
new mode class MODE_OPAQUE to deal with the problems (PR 96791) we had
been having with POImode/PXImode in powerpc target. This patch is the
accumulation of changes I needed to make to add this and make it us
From: Aaron Sawdey
Richard,
Thanks for the review. I think I have resolved everything, as follows:
* I was able to remove the const_tiny_rtx initialization for
MODE_OPAQUE. If that becomes a problem it's a pretty simple matter to
use an UNSPEC to assign a constant to an opaque mode if necessa
From: Aaron Sawdey
This adds new values for insn attr type for p10 fusion. The genfusion.pl
script is modified to use them, and fusion.md regenerated to capture
the new patterns. There are also some formatting only changes to
fusion.md that apparently weren't captured after a previous commit
of g
From: Aaron Sawdey
This adds some test cases to make sure that the combine patterns for p10
fusion are working.
OK for trunk?
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/fusion-p10-ldcmpi.c: New file.
* gcc.target/powerpc/fusion-p10-2logical.c: New file.
---
.../gcc.target/po
From: Aaron Sawdey
Two more sets of combine patterns for p10 fusion. These require
the "Add insn types for fusion pairs" patch I posted earlier today.
If ok I would like to put these in gcc 12 trunk and backport for 11.2.
Thanks,
Aaron
Aaron Sawdey (2):
combine patterns for add-add fusio
From: Aaron Sawdey
This patch adds a function to genfusion.pl to add a couple
more patterns so combine can do fusion of pairs of add and
vaddudm instructions.
gcc/ChangeLog:
* gcc/config/rs6000/genfusion.pl (gen_addadd): New function.
* gcc/config/rs6000/fusion.md: Regenerate fi
From: Aaron Sawdey
This patch modifies the function in genfusion.pl for generating
the logical-logical patterns so that it can also generate the
add-logical and logical-add patterns which are very similar.
gcc/ChangeLog:
* config/rs6000/genfusion.pl (gen_logical_addsubf): Refactor to
From: Aaron Sawdey
This option is mostly being added to provide -mno-block-ops-unaligned-vsx.
The default is set the same as -mefficient-unaligned-vsx. This option will
control the use of unaligned VSX loads/stores in the inline expansion
of memcpy() and memmove(). The use case for this would be
One of the things that address_to_insn_form() is used for is determining
whether a PC-relative addressing instruction could be used. In
particular predicate pcrel_external_address and function
prefixed_paddi_p() both use it for this purpose. So what emerged in
PR/94542 is that it should be look
On 2020-04-13 10:08, will schmidt wrote:
On Fri, 2020-04-10 at 18:00 -0500, acsawdey via Gcc-patches wrote:
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 2b6613bcb7e..c77e60a718f 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -24824,15
From: Aaron Sawdey
This patch adds a new function to genfusion.pl to generate patterns for
logical-logical fusion. They are enabled by default for power10 and can
be disabled by -mno-power10-fusion-2logical or -mno-power10-fusion.
This patch builds on top of the load-cmpi patch posted earlier th
From: Aaron Sawdey
This adds some test cases to make sure that the combine patterns for p10
fusion are working.
These test cases pass on power10. OK for trunk after the 2 previous patches
for the fusion patterns go in?
Thanks!
Aaron
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/fusi
From: Aaron Sawdey
This patch changes powerpc MMA builtins to use the new opaque
mode class and use modes OO (32 bytes) and XO (64 bytes)
instead of POI/PXI. Using the opaque modes prevents
optimization from trying to do anything with vector
pair/quad, which was the problem we were seeing with th
From: Aaron Sawdey
After the MMA opaque mode patch goes in, we can re-enable
use of vector pair in the inline expansion of memcpy/memmove.
After bootstrap/regtest, OK for trunk?
Thanks,
Aaron
gcc/
* config/rs6000/rs6000.c (rs6000_option_override_internal):
Enable vector pai
From: Aaron Sawdey
After building some larger codes using opaque types and some C++ codes
using opaque types it became clear I needed to go through and look for
places where opaque types and modes needed to be handled. A whole pile
of one-liners.
If bootstrap/regtest passes for ppc64le and x86_6
From: Aaron Sawdey
Segher & Bergner -
Thanks for the reviews, here's the updated patch after fixing those things.
We now have an UNSPEC for xxsetaccz, and an accompanying change to
rs6000_rtx_costs to make it be cost 0 so that CSE doesn't try to replace it
with a bunch of register moves.
If bo
From: Aaron Sawdey
This patch adds the first batch of patterns to support p10 fusion. These
will allow combine to create a single insn for a pair of instructions
that power10 can fuse and execute. These particular ones have the
requirement that only cr0 can be used when fusing a load with a
From: Aaron Sawdey
Ping. I've folded in the changes to comments suggested by Will Schmidt.
This patch implements an RTL pass that looks for pc-relative loads of the
address of an external variable using the PCREL_GOT relocation and a
single load or store that uses that external address.
Produced
From: Aaron Sawdey
This patch implements an RTL pass that looks for pc-relative loads of the
address of an external variable using the PCREL_GOT relocation and a
single load or store that uses that external address. It then uses the
PCREL_OPT relocation to convert that first load into a single pc-
From: Aaron Sawdey
This patch adds the first couple patterns to support p10 fusion. These
will allow combine to create a single insn for a pair of instructions
that power10 can fuse and execute. These particular ones have the
requirement that only cr0 can be used when fusing a load with a co