[PATCH,rs6000] do not generate fusion.md, update contrib/gcc_update

2021-02-01 Thread acsawdey--- via Gcc-patches
From: Aaron Sawdey 

In a previous fusion-combine patch for rs6000, Segher had asked me to
comment out the automatic regeneration of fusion.md. And more recently
Edelsohn pointed out that gcc_update needed to fix the timestamp of
fusion.md so it didn't get unnecessarily regenerated.

OK for trunk if bootstrap/regtest passes?

Thanks,
   Aaron

contrib/ChangeLog:

* gcc_update (files_and_dependencies): Add dependency for
gcc/config/rs6000/fusion.md on gcc/config/rs6000/genfusion.pl.

gcc/ChangeLog:

* config/rs6000/t-rs6000: Comment out auto generation of
fusion.md for now.
---
 contrib/gcc_update | 1 +
 gcc/config/rs6000/t-rs6000 | 4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/contrib/gcc_update b/contrib/gcc_update
index 43d284d8125..45a27b76cc3 100755
--- a/contrib/gcc_update
+++ b/contrib/gcc_update
@@ -89,6 +89,7 @@ gcc/config/c6x/c6x-mult.md: gcc/config/c6x/c6x-mult.md.in gcc/config/c6x/genmult
 gcc/config/m68k/m68k-tables.opt: gcc/config/m68k/m68k-devices.def gcc/config/m68k/m68k-isas.def gcc/config/m68k/m68k-microarchs.def gcc/config/m68k/genopt.sh
 gcc/config/mips/mips-tables.opt: gcc/config/mips/mips-cpus.def gcc/config/mips/genopt.sh
 gcc/config/rs6000/rs6000-tables.opt: gcc/config/rs6000/rs6000-cpus.def gcc/config/rs6000/genopt.sh
+gcc/config/rs6000/fusion.md: gcc/config/rs6000/genfusion.pl
 gcc/config/tilegx/mul-tables.c: gcc/config/tilepro/gen-mul-tables.cc
 gcc/config/tilepro/mul-tables.c: gcc/config/tilepro/gen-mul-tables.cc
 # And then, language-specific files
diff --git a/gcc/config/rs6000/t-rs6000 b/gcc/config/rs6000/t-rs6000
index e3a58bf31bf..1541a653738 100644
--- a/gcc/config/rs6000/t-rs6000
+++ b/gcc/config/rs6000/t-rs6000
@@ -47,8 +47,8 @@ rs6000-call.o: $(srcdir)/config/rs6000/rs6000-call.c
$(COMPILE) $<
$(POSTCOMPILE)
 
-$(srcdir)/config/rs6000/fusion.md: $(srcdir)/config/rs6000/genfusion.pl
-   $(srcdir)/config/rs6000/genfusion.pl > $(srcdir)/config/rs6000/fusion.md
+#$(srcdir)/config/rs6000/fusion.md: $(srcdir)/config/rs6000/genfusion.pl
+#  $(srcdir)/config/rs6000/genfusion.pl > $(srcdir)/config/rs6000/fusion.md
 
 $(srcdir)/config/rs6000/rs6000-tables.opt: $(srcdir)/config/rs6000/genopt.sh \
   $(srcdir)/config/rs6000/rs6000-cpus.def
-- 
2.27.0



[PATCH,rs6000] [v2] Optimize pcrel access of globals

2021-02-22 Thread acsawdey--- via Gcc-patches
From: Aaron Sawdey 

This patch implements a RTL pass that looks for pc-relative loads of the
address of an external variable using the PCREL_GOT relocation and a
single load or store that uses that external address.
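
As a rough illustration of the access pattern the pass looks for, here is
a minimal C sketch. The names ext_counter, read_counter, write_counter and
check_counter are made up for this example and are not taken from the
patch or its test cases. The definition of ext_counter sits in the same
file only so that the sketch links on its own; in a real test it would
live in another translation unit, so that its address has to be loaded
before the single load or store that uses it.

```c
#include <assert.h>

/* Hypothetical external variable.  In a real test the definition would
   live in another translation unit or shared object.  */
extern long ext_counter;

/* Defined here only so that this sketch links on its own.  */
long ext_counter = 41;

/* A single load through the external's address.  */
long
read_counter (void)
{
  return ext_counter;
}

/* A single store through the external's address.  */
void
write_counter (long value)
{
  ext_counter = value;
}

/* Small self-check of the two helpers.  */
void
check_counter (void)
{
  assert (read_counter () == 41);
  write_counter (7);
  assert (read_counter () == 7);
}
```

Each helper contains one access to the external variable, which is the
shape described above: one address load paired with one dependent load or
store.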

Produced by a cast of thousands:
 * Michael Meissner
 * Peter Bergner
 * Bill Schmidt
 * Alan Modra
 * Segher Boessenkool
 * Aaron Sawdey

This incorporates the changes requested in Segher's review. A few things I
did not change were the insn-at-a-time scan that could be done with DF, and
I did not change to using statistics.[ch] for the counters struct. I did try
to improve the naming, and rewrote a number of comments to make them consistent
with the code, and generally tried to make things more readable.

OK for trunk if bootstrap/regtest passes?

Thanks!
   Aaron

gcc/ChangeLog:

* config.gcc: Add pcrel-opt.o.
* config/rs6000/rs6000-pcrel-opt.c: New file.
* config/rs6000/pcrel-opt.md: New file.
* config/rs6000/predicates.md: Add d_form_memory predicate.
* config/rs6000/rs6000-cpus.def: Add OPTION_MASK_PCREL_OPT.
* config/rs6000/rs6000-passes.def: Add pass_pcrel_opt.
* config/rs6000/rs6000-protos.h: Add reg_to_non_prefixed(),
pcrel_opt_valid_mem_p(), output_pcrel_opt_reloc(),
and make_pass_pcrel_opt().
* config/rs6000/rs6000.c (reg_to_non_prefixed): Make global.
(rs6000_option_override_internal): Add pcrel-opt.
(rs6000_delegitimize_address): Support pcrel-opt.
(rs6000_opt_masks): Add pcrel-opt.
(pcrel_opt_valid_mem_p): New function.
(rs6000_asm_output_opcode): Reset next_insn_prefixed_p.
(output_pcrel_opt_reloc): New function.
* config/rs6000/rs6000.md (loads_extern_addr): New attr.
(pcrel_extern_addr): Set loads_extern_addr.
Add include for pcrel-opt.md.
* config/rs6000/rs6000.opt: Add -mpcrel-opt.
* config/rs6000/t-rs6000: Add rules for rs6000-pcrel-opt.c and
pcrel-opt.md.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pcrel-opt-inc-di.c: New test.
* gcc.target/powerpc/pcrel-opt-ld-df.c: New test.
* gcc.target/powerpc/pcrel-opt-ld-di.c: New test.
* gcc.target/powerpc/pcrel-opt-ld-hi.c: New test.
* gcc.target/powerpc/pcrel-opt-ld-qi.c: New test.
* gcc.target/powerpc/pcrel-opt-ld-sf.c: New test.
* gcc.target/powerpc/pcrel-opt-ld-si.c: New test.
* gcc.target/powerpc/pcrel-opt-ld-vector.c: New test.
* gcc.target/powerpc/pcrel-opt-st-df.c: New test.
* gcc.target/powerpc/pcrel-opt-st-di.c: New test.
* gcc.target/powerpc/pcrel-opt-st-hi.c: New test.
* gcc.target/powerpc/pcrel-opt-st-qi.c: New test.
* gcc.target/powerpc/pcrel-opt-st-sf.c: New test.
* gcc.target/powerpc/pcrel-opt-st-si.c: New test.
* gcc.target/powerpc/pcrel-opt-st-vector.c: New test.
---
 gcc/config.gcc|   8 +-
 gcc/config/rs6000/pcrel-opt.md| 399 
 gcc/config/rs6000/predicates.md   |  21 +
 gcc/config/rs6000/rs6000-cpus.def |   2 +
 gcc/config/rs6000/rs6000-passes.def   |   8 +
 gcc/config/rs6000/rs6000-pcrel-opt.c  | 924 ++
 gcc/config/rs6000/rs6000-protos.h |   4 +
 gcc/config/rs6000/rs6000.c| 111 ++-
 gcc/config/rs6000/rs6000.md   |   8 +-
 gcc/config/rs6000/rs6000.opt  |   4 +
 gcc/config/rs6000/t-rs6000|   7 +-
 .../gcc.target/powerpc/pcrel-opt-inc-di.c |  17 +
 .../gcc.target/powerpc/pcrel-opt-ld-df.c  |  36 +
 .../gcc.target/powerpc/pcrel-opt-ld-di.c  |  42 +
 .../gcc.target/powerpc/pcrel-opt-ld-hi.c  |  42 +
 .../gcc.target/powerpc/pcrel-opt-ld-qi.c  |  42 +
 .../gcc.target/powerpc/pcrel-opt-ld-sf.c  |  42 +
 .../gcc.target/powerpc/pcrel-opt-ld-si.c  |  41 +
 .../gcc.target/powerpc/pcrel-opt-ld-vector.c  |  36 +
 .../gcc.target/powerpc/pcrel-opt-st-df.c  |  36 +
 .../gcc.target/powerpc/pcrel-opt-st-di.c  |  36 +
 .../gcc.target/powerpc/pcrel-opt-st-hi.c  |  42 +
 .../gcc.target/powerpc/pcrel-opt-st-qi.c  |  42 +
 .../gcc.target/powerpc/pcrel-opt-st-sf.c  |  36 +
 .../gcc.target/powerpc/pcrel-opt-st-si.c  |  41 +
 .../gcc.target/powerpc/pcrel-opt-st-vector.c  |  36 +
 26 files changed, 2054 insertions(+), 9 deletions(-)
 create mode 100644 gcc/config/rs6000/pcrel-opt.md
 create mode 100644 gcc/config/rs6000/rs6000-pcrel-opt.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-inc-di.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-df.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-di.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-hi.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-qi.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-sf.c
 create mode 

[PATCH,rs6000] Tighten predicates for p10 ld/cmpi fusion

2021-03-08 Thread acsawdey--- via Gcc-patches
From: Aaron Sawdey 

PR99070 is caused by a fusion pattern matching an operand that the
individual instructions cannot match once the pattern is split later.
In this case the ld+cmpi patterns allowed a d-form load address. The
split condition rightly split such an insn, but that left us with a
load that a ds-form ld instruction could not match, hence the ICE.
This only happened if the target cpu was not power10; when targeting
power10, a prefixed pld instruction would be generated instead, because
pld can handle a d-form address. However, that is not optimal code
either.

So the solution is a new predicate (ds_form_mem_operand) that accepts
only addresses that a ds-form load can take. A small modification to
the genfusion.pl script then changes the relevant ld+cmpi patterns to
use the new predicate.
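
To make the d-form/ds-form distinction concrete, here is a small C sketch
of the offset constraint. It assumes the usual Power ISA encoding, in
which a D-form displacement is any signed 16-bit value while a DS-form
displacement must additionally be a multiple of 4. The helper names are
hypothetical, and this is not the code of the ds_form_mem_operand
predicate, which works on RTL addresses rather than plain integers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: true if OFFSET fits a D-form displacement,
   i.e. a signed 16-bit immediate.  */
bool
fits_d_form (long offset)
{
  return offset >= -32768 && offset <= 32767;
}

/* Hypothetical helper: true if OFFSET fits a DS-form displacement,
   i.e. a signed 16-bit immediate that is also a multiple of 4.  */
bool
fits_ds_form (long offset)
{
  return fits_d_form (offset) && offset % 4 == 0;
}

/* Small self-check: 6 is fine for D-form but not for DS-form.  */
void
check_forms (void)
{
  assert (fits_d_form (6));
  assert (!fits_ds_form (6));
  assert (fits_ds_form (8));
}
```

An offset such as 6 passes the first check but not the second, which
matches the situation described above: acceptable to a d-form load, but
not encodable in a ds-form ld.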

OK for trunk if bootstrap/regtest passes?

gcc/ChangeLog

PR target/99070
* config/rs6000/predicates.md (ds_form_mem_operand): New
predicate.
* config/rs6000/genfusion.pl (gen_ld_cmpi_p10): Use
ds_form_mem_operand in ld/lwa patterns.
* config/rs6000/fusion.md: Regenerate file.
---
 gcc/config/rs6000/fusion.md | 177 
 gcc/config/rs6000/genfusion.pl  |   7 +-
 gcc/config/rs6000/predicates.md |  20 
 3 files changed, 113 insertions(+), 91 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index 737a6da385f..56478fcae1d 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -1,7 +1,6 @@
-;; -*- buffer-read-only: t -*-
 ;; Generated automatically by genfusion.pl
 
-;; Copyright (C) 2020 Free Software Foundation, Inc.
+;; Copyright (C) 2020,2021 Free Software Foundation, Inc.
 ;;
 ;; This file is part of GCC.
 ;;
@@ -23,18 +22,18 @@
 ;; load mode is DI result mode is clobber compare mode is CC extend is none
 (define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
-(compare:CC (match_operand:DI 1 "non_update_memory_operand" "m")
- (match_operand:DI 3 "const_m1_to_1_operand" "n")))
+(compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m")
+(match_operand:DI 3 "const_m1_to_1_operand" "n")))
(clobber (match_scratch:DI 0 "=r"))]
   "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
-  "ld%X1 %0,%1\;cmpdi 0,%0,%3"
+  "ld%X1 %0,%1\;cmpdi %2,%0,%3"
   "&& reload_completed
&& (cc_reg_not_cr0_operand (operands[2], CCmode)
-   || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), DImode, NON_PREFIXED_DS))"
+   || !address_is_non_pfx_d_or_x (XEXP (operands[1], 0),
+  DImode, NON_PREFIXED_DS))"
   [(set (match_dup 0) (match_dup 1))
(set (match_dup 2)
-(compare:CC (match_dup 0)
-   (match_dup 3)))]
+(compare:CC (match_dup 0) (match_dup 3)))]
   ""
   [(set_attr "type" "load")
(set_attr "cost" "8")
@@ -44,18 +43,18 @@ (define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none"
 ;; load mode is DI result mode is clobber compare mode is CCUNS extend is none
 (define_insn_and_split "*ld_cmpldi_cr0_DI_clobber_CCUNS_none"
   [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
-(compare:CCUNS (match_operand:DI 1 "non_update_memory_operand" "m")
- (match_operand:DI 3 "const_0_to_1_operand" "n")))
+(compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "m")
+   (match_operand:DI 3 "const_0_to_1_operand" "n")))
(clobber (match_scratch:DI 0 "=r"))]
   "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
-  "ld%X1 %0,%1\;cmpldi 0,%0,%3"
+  "ld%X1 %0,%1\;cmpldi %2,%0,%3"
   "&& reload_completed
&& (cc_reg_not_cr0_operand (operands[2], CCmode)
-   || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), DImode, NON_PREFIXED_DS))"
+   || !address_is_non_pfx_d_or_x (XEXP (operands[1], 0),
+  DImode, NON_PREFIXED_DS))"
   [(set (match_dup 0) (match_dup 1))
(set (match_dup 2)
-(compare:CCUNS (match_dup 0)
-   (match_dup 3)))]
+(compare:CCUNS (match_dup 0) (match_dup 3)))]
   ""
   [(set_attr "type" "load")
(set_attr "cost" "8")
@@ -65,18 +64,18 @@ (define_insn_and_split "*ld_cmpldi_cr0_DI_clobber_CCUNS_none"
 ;; load mode is DI result mode is DI compare mode is CC extend is none
 (define_insn_and_split "*ld_cmpdi_cr0_DI_DI_CC_none"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
-(compare:CC (match_operand:DI 1 "non_update_memory_operand" "m")
- (match_operand:DI 3 "const_m1_to_1_operand" "n")))
+(compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m")
+(match_operand:DI 3 "const_m1_to_1_operand" "n")))
(set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))]
   "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
-  "ld%X1 %0,%1\;cmpdi 0,%0,%3"
+  "ld%X1 %0,%1\;cmpdi %2,%0,%3"
   "&& reload_completed
&& (cc

Re: [PATCH, rs6000] Optimize pcrel access of globals (updated, ping)

2020-11-04 Thread acsawdey--- via Gcc-patches
From: Aaron Sawdey 

Ping, as it has been a while.
This also includes a slight fix to make sure that all references can get
optimized.

This patch implements a RTL pass that looks for pc-relative loads of the
address of an external variable using the PCREL_GOT relocation and a
single load or store that uses that external address.

Produced by a cast of thousands:
 * Michael Meissner
 * Peter Bergner
 * Bill Schmidt
 * Alan Modra
 * Segher Boessenkool
 * Aaron Sawdey

Passes bootstrap/regtest on ppc64le power10. OK for trunk?

gcc/ChangeLog:

* config.gcc: Add pcrel-opt.o.
* config/rs6000/pcrel-opt.c: New file.
* config/rs6000/pcrel-opt.md: New file.
* config/rs6000/predicates.md: Add d_form_memory predicate.
* config/rs6000/rs6000-cpus.def: Add OPTION_MASK_PCREL_OPT.
* config/rs6000/rs6000-passes.def: Add pass_pcrel_opt.
* config/rs6000/rs6000-protos.h: Add reg_to_non_prefixed(),
offsettable_non_prefixed_memory(), output_pcrel_opt_reloc(),
and make_pass_pcrel_opt().
* config/rs6000/rs6000.c (reg_to_non_prefixed): Make global.
(rs6000_option_override_internal): Add pcrel-opt.
(rs6000_delegitimize_address): Support pcrel-opt.
(rs6000_opt_masks): Add pcrel-opt.
(offsettable_non_prefixed_memory): New function.
(rs6000_asm_output_opcode): Reset next_insn_prefixed_p.
(output_pcrel_opt_reloc): New function.
* config/rs6000/rs6000.md (loads_extern_addr): New attr.
(pcrel_extern_addr): Set loads_extern_addr.
Add include for pcrel-opt.md.
* config/rs6000/rs6000.opt: Add -mpcrel-opt.
* config/rs6000/t-rs6000: Add rules for pcrel-opt.c and
pcrel-opt.md.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pcrel-opt-inc-di.c: New test.
* gcc.target/powerpc/pcrel-opt-ld-df.c: New test.
* gcc.target/powerpc/pcrel-opt-ld-di.c: New test.
* gcc.target/powerpc/pcrel-opt-ld-hi.c: New test.
* gcc.target/powerpc/pcrel-opt-ld-qi.c: New test.
* gcc.target/powerpc/pcrel-opt-ld-sf.c: New test.
* gcc.target/powerpc/pcrel-opt-ld-si.c: New test.
* gcc.target/powerpc/pcrel-opt-ld-vector.c: New test.
* gcc.target/powerpc/pcrel-opt-st-df.c: New test.
* gcc.target/powerpc/pcrel-opt-st-di.c: New test.
* gcc.target/powerpc/pcrel-opt-st-hi.c: New test.
* gcc.target/powerpc/pcrel-opt-st-qi.c: New test.
* gcc.target/powerpc/pcrel-opt-st-sf.c: New test.
* gcc.target/powerpc/pcrel-opt-st-si.c: New test.
* gcc.target/powerpc/pcrel-opt-st-vector.c: New test.
---
 gcc/config.gcc|   6 +-
 gcc/config/rs6000/pcrel-opt.c | 888 ++
 gcc/config/rs6000/pcrel-opt.md| 386 
 gcc/config/rs6000/predicates.md   |  23 +
 gcc/config/rs6000/rs6000-cpus.def |   2 +
 gcc/config/rs6000/rs6000-passes.def   |   8 +
 gcc/config/rs6000/rs6000-protos.h |   4 +
 gcc/config/rs6000/rs6000.c| 116 ++-
 gcc/config/rs6000/rs6000.md   |   8 +-
 gcc/config/rs6000/rs6000.opt  |   4 +
 gcc/config/rs6000/t-rs6000|   7 +-
 .../gcc.target/powerpc/pcrel-opt-inc-di.c |  18 +
 .../gcc.target/powerpc/pcrel-opt-ld-df.c  |  36 +
 .../gcc.target/powerpc/pcrel-opt-ld-di.c  |  43 +
 .../gcc.target/powerpc/pcrel-opt-ld-hi.c  |  42 +
 .../gcc.target/powerpc/pcrel-opt-ld-qi.c  |  42 +
 .../gcc.target/powerpc/pcrel-opt-ld-sf.c  |  42 +
 .../gcc.target/powerpc/pcrel-opt-ld-si.c  |  41 +
 .../gcc.target/powerpc/pcrel-opt-ld-vector.c  |  36 +
 .../gcc.target/powerpc/pcrel-opt-st-df.c  |  36 +
 .../gcc.target/powerpc/pcrel-opt-st-di.c  |  37 +
 .../gcc.target/powerpc/pcrel-opt-st-hi.c  |  42 +
 .../gcc.target/powerpc/pcrel-opt-st-qi.c  |  42 +
 .../gcc.target/powerpc/pcrel-opt-st-sf.c  |  36 +
 .../gcc.target/powerpc/pcrel-opt-st-si.c  |  41 +
 .../gcc.target/powerpc/pcrel-opt-st-vector.c  |  36 +
 26 files changed, 2013 insertions(+), 9 deletions(-)
 create mode 100644 gcc/config/rs6000/pcrel-opt.c
 create mode 100644 gcc/config/rs6000/pcrel-opt.md
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-inc-di.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-df.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-di.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-hi.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-qi.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-sf.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-si.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-vector.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-st-df.c
 create mode 100644 gcc/testsuite/gcc.target/powerp

[PATCH] Add MODE_OPAQUE

2020-11-13 Thread acsawdey--- via Gcc-patches
From: Aaron Sawdey 

In a discussion on IRC, Richard Sandiford suggested adding a new mode
class MODE_OPAQUE to deal with the problems (PR 96791) we had been
having with POImode/PXImode in the powerpc target. This patch is the
accumulation of changes I needed to make to add this and make it usable
for what power10 MMA needs.

MODE_OPAQUE modes allow you to have modes for which you can just
define loads and stores. By design, optimization does not expect to
know how to do arithmetic or subregs on these modes. This allows us to
have modes for multi-register vector operations where we don't want to
open Pandora's Box and define general arithmetic operations.
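
The intent can be sketched with a toy model in C. Everything here is
illustrative: toy_mode, toy_opaque_mode_p, toy_can_move and
toy_can_do_arith are invented for the example and only mimic the idea
behind the OPAQUE_MODE_P() test this patch adds to machmode.h. They are
not GCC's data structures.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a few mode classes.  */
enum toy_mode_class { TOY_MODE_INT, TOY_MODE_FLOAT, TOY_MODE_OPAQUE };

/* Toy description of a machine mode: a name, a class and a size.  */
struct toy_mode
{
  const char *name;
  enum toy_mode_class mclass;
  unsigned int bytesize;
};

/* True if M belongs to the opaque class.  */
bool
toy_opaque_mode_p (const struct toy_mode *m)
{
  return m->mclass == TOY_MODE_OPAQUE;
}

/* Every mode with a size can be loaded and stored.  */
bool
toy_can_move (const struct toy_mode *m)
{
  return m->bytesize > 0;
}

/* Arithmetic and subregs are only assumed for non-opaque modes.  */
bool
toy_can_do_arith (const struct toy_mode *m)
{
  return !toy_opaque_mode_p (m);
}

/* Small self-check with a hypothetical 32-byte opaque mode.  */
void
check_toy_modes (void)
{
  struct toy_mode pair = { "TOYPAIR", TOY_MODE_OPAQUE, 32 };
  assert (toy_can_move (&pair));
  assert (!toy_can_do_arith (&pair));
}
```

In this model an optimizer would consult toy_can_do_arith before trying
anything beyond a move, which is the kind of guard an opaque mode class
is meant to make possible.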

This patch will be followed by a target-specific patch to change the
powerpc power10 MMA builtins to use opaque modes. That will also let us
use the vector pair loads/stores defined with them in the inline
expansion of memcpy/memmove, allowing me to fix PR 96791.

Regstrap in progress on ppc64le and x86_64, ok for trunk if successful?

Thanks,
   Aaron


 gcc/ChangeLog
 PR target/96791
 * mode-classes.def: Add MODE_OPAQUE.
 * machmode.def: Add OPAQUE_MODE.
 * tree.def: Add OPAQUE_TYPE for types that will use MODE_OPAQUE.
 * machmode.h: Add OPAQUE_MODE_P().
 * genmodes.c (complete_mode): Add MODE_OPAQUE.
 (make_opaque_mode): New function.
 * tree.c (tree_code_size): Add OPAQUE_TYPE.
 * tree.h: Add OPAQUE_TYPE_P().
 * tree-ssanames.c (get_nonzero_bits): OPAQUE_TYPE has an unknown
 number of nonzero bits.
 * stor-layout.c (int_mode_for_mode): Treat MODE_OPAQUE modes
 like BLKmode.
 * ira.c (find_moveable_pseudos): Treat MODE_OPAQUE modes more
 like integer/float modes here.
 * emit-rtl.c (init_emit_once): Create small rtx consts because we
 do want const0_rtx to work with opaque modes.
 * dbxout.c (dbxout_type): Treat OPAQUE_TYPE like VOID_TYPE.
 * tree-pretty-print.c (dump_generic_node): Treat OPAQUE_TYPE like
 other types.

---
 gcc/dbxout.c|  1 +
 gcc/emit-rtl.c  |  3 +++
 gcc/genmodes.c  | 22 ++
 gcc/ira.c   |  3 ++-
 gcc/machmode.def|  3 +++
 gcc/machmode.h  |  4 
 gcc/mode-classes.def|  3 ++-
 gcc/stor-layout.c   |  3 +++
 gcc/tree-pretty-print.c |  1 +
 gcc/tree-ssanames.c |  3 +++
 gcc/tree.c  |  1 +
 gcc/tree.def|  6 ++
 gcc/tree.h  |  3 +++
 13 files changed, 54 insertions(+), 2 deletions(-)

diff --git a/gcc/dbxout.c b/gcc/dbxout.c
index 5a20fdecdcc..eaee2f19ce0 100644
--- a/gcc/dbxout.c
+++ b/gcc/dbxout.c
@@ -1963,6 +1963,7 @@ dbxout_type (tree type, int full)
 case VOID_TYPE:
 case NULLPTR_TYPE:
 case LANG_TYPE:
+case OPAQUE_TYPE:
   /* For a void type, just define it as itself; i.e., "5=5".
 This makes us consider it defined
 without saying what it is.  The debugger will make it
diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index 3706f0a03fd..44a3b660bd0 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -6268,6 +6268,9 @@ init_emit_once (void)
   mode <= MAX_MODE_PARTIAL_INT;
   mode = (machine_mode)((int)(mode) + 1))
const_tiny_rtx[i][(int) mode] = GEN_INT (i);
+
+  FOR_EACH_MODE_IN_CLASS (mode, MODE_OPAQUE)
+   const_tiny_rtx[i][(int) mode] = GEN_INT (i);
 }
 
   const_tiny_rtx[3][(int) VOIDmode] = constm1_rtx;
diff --git a/gcc/genmodes.c b/gcc/genmodes.c
index bd78310ea24..369fe0aaec5 100644
--- a/gcc/genmodes.c
+++ b/gcc/genmodes.c
@@ -358,6 +358,14 @@ complete_mode (struct mode_data *m)
   m->component = 0;
   break;
 
+case MODE_OPAQUE:
+  /* Opaque modes have size and precision.  */
+  validate_mode (m, OPTIONAL, SET, UNSET, UNSET, UNSET);
+
+  m->ncomponents = 1;
+  m->component = 0;
+  break;
+
 case MODE_PARTIAL_INT:
   /* A partial integer mode uses ->component to say what the
 corresponding full-size integer mode is, and may also
@@ -588,6 +596,20 @@ make_int_mode (const char *name,
   m->precision = precision;
 }
 
+#define OPAQUE_MODE(N, B)  \
+  make_opaque_mode (#N, -1U, B, __FILE__, __LINE__)
+
+static void __attribute__((unused))
+make_opaque_mode (const char *name,
+ unsigned int precision,
+ unsigned int bytesize,
+ const char *file, unsigned int line)
+{
+  struct mode_data *m = new_mode (MODE_OPAQUE, name, file, line);
+  m->bytesize = bytesize;
+  m->precision = precision;
+}
+
 #define FRACT_MODE(N, Y, F) \
make_fixed_point_mode (MODE_FRACT, #N, Y, 0, F, __FILE__, __LINE__)
 
diff --git a/gcc/ira.c b/gcc/ira.c
index 050405f1833..d7a0482d121 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -4666,7 +4666,8 @@ find_moveable_pseudos (void)
|| !DF_REF_INSN_INFO (def)
|| HARD_REGISTER_NUM_P (regno)

Re: [PATCH] Add MODE_OPAQUE

2020-11-16 Thread acsawdey--- via Gcc-patches
From: Aaron Sawdey 

Richard,
  Thanks for the review. I think I have resolved everything, as follows:

* I was able to remove the const_tiny_rtx initialization for
MODE_OPAQUE.  If that becomes a problem it's a pretty simple matter to
use an UNSPEC to assign a constant to an opaque mode if necessary. The
whole point of this exercise was not to have this thing treated as an
integral type so I think it's best to leave this out if at all
possible.

* I ended up adding a precision to opaque after I had put in that hack
in get_nonzero_bits(). Now that it has a precision (equal to bitsize
as you say) this is no longer needed. The underlying problem there was
that without a precision, you ended up returning wi::shwi(-1,0) which
did not get treated as -1.

* I have documented OPAQUE_TYPE in generic.texi and MODE_OPAQUE in
rtl.texi.

OK for trunk if bootstrap/regtest passes on x86_64 and ppc64le?

Thanks,
Aaron

gcc/ChangeLog
PR target/96791
* mode-classes.def: Add MODE_OPAQUE.
* machmode.def: Add OPAQUE_MODE.
* tree.def: Add OPAQUE_TYPE for types that will use
MODE_OPAQUE.
* doc/generic.texi: Document OPAQUE_TYPE.
* doc/rtl.texi: Document MODE_OPAQUE.
* machmode.h: Add OPAQUE_MODE_P().
* genmodes.c (complete_mode): Add MODE_OPAQUE.
(make_opaque_mode): New function.
* tree.c (tree_code_size): Add OPAQUE_TYPE.
* tree.h: Add OPAQUE_TYPE_P().
* stor-layout.c (int_mode_for_mode): Treat MODE_OPAQUE modes
like BLKmode.
* ira.c (find_moveable_pseudos): Treat MODE_OPAQUE modes more
like integer/float modes here.
* dbxout.c (dbxout_type): Treat OPAQUE_TYPE like VOID_TYPE.
* tree-pretty-print.c (dump_generic_node): Treat OPAQUE_TYPE
like other types.
---
 gcc/dbxout.c|  1 +
 gcc/doc/generic.texi|  8 
 gcc/doc/rtl.texi|  6 ++
 gcc/genmodes.c  | 22 ++
 gcc/ira.c   |  4 +++-
 gcc/machmode.def|  3 +++
 gcc/machmode.h  |  4 
 gcc/mode-classes.def|  3 ++-
 gcc/stor-layout.c   |  3 +++
 gcc/tree-pretty-print.c |  1 +
 gcc/tree.c  |  1 +
 gcc/tree.def|  6 ++
 gcc/tree.h  |  3 +++
 13 files changed, 63 insertions(+), 2 deletions(-)

diff --git a/gcc/dbxout.c b/gcc/dbxout.c
index 5a20fdecdcc..eaee2f19ce0 100644
--- a/gcc/dbxout.c
+++ b/gcc/dbxout.c
@@ -1963,6 +1963,7 @@ dbxout_type (tree type, int full)
 case VOID_TYPE:
 case NULLPTR_TYPE:
 case LANG_TYPE:
+case OPAQUE_TYPE:
   /* For a void type, just define it as itself; i.e., "5=5".
 This makes us consider it defined
 without saying what it is.  The debugger will make it
diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi
index 7373266c69f..7e7b74c6c8b 100644
--- a/gcc/doc/generic.texi
+++ b/gcc/doc/generic.texi
@@ -302,6 +302,7 @@ The elements are indexed from zero.
 @tindex ARRAY_TYPE
 @tindex RECORD_TYPE
 @tindex UNION_TYPE
+@tindex OPAQUE_TYPE
 @tindex UNKNOWN_TYPE
 @tindex OFFSET_TYPE
 @findex TYPE_UNQUALIFIED
@@ -487,6 +488,13 @@ assigned to that constant.  These constants will appear in the order in
 which they were declared.  The @code{TREE_TYPE} of each of these
 constants will be the type of enumeration type itself.
 
+@item OPAQUE_TYPE
+Used for things that use a @code{MODE_OPAQUE} mode class in the
+backend. Opaque types have a size and precision, and can be held in
+memory or registers. They are used when we do not want the compiler to
+make assumptions about the availability of other operations as would
+happen with integer types.
+
 @item BOOLEAN_TYPE
 Used to represent the @code{bool} type.
 
diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi
index 22af5731bb6..cf892d425a2 100644
--- a/gcc/doc/rtl.texi
+++ b/gcc/doc/rtl.texi
@@ -1406,6 +1406,12 @@ Pointer bounds modes.  Used to represent values of pointer bounds type.
 Operations in these modes may be executed as NOPs depending on hardware
 features and environment setup.
 
+@findex MODE_OPAQUE
+@item MODE_OPAQUE
+This is a mode class for modes that don't want to provide operations
+other than moves between registers/memory. They have a size and
+precision and that's all.
+
 @findex MODE_RANDOM
 @item MODE_RANDOM
 This is a catchall mode class for modes which don't fit into the above
diff --git a/gcc/genmodes.c b/gcc/genmodes.c
index bd78310ea24..34b52fe41d6 100644
--- a/gcc/genmodes.c
+++ b/gcc/genmodes.c
@@ -358,6 +358,14 @@ complete_mode (struct mode_data *m)
   m->component = 0;
   break;
 
+case MODE_OPAQUE:
+  /* Opaque modes have size and precision.  */
+  validate_mode (m, OPTIONAL, SET, UNSET, UNSET, UNSET);
+
+  m->ncomponents = 1;
+  m->component = 0;
+  break;
+
 case MODE_PARTIAL_INT:
   /* A partial integer mode uses ->component to say what the
 corresponding full-size integer mode is, and may also
@@ -588,6 +596,20 @@ 

[PATCH,rs6000] Add insn types for fusion pairs

2021-04-26 Thread acsawdey--- via Gcc-patches
From: Aaron Sawdey 

This adds new values of the insn attribute "type" for p10 fusion. The
genfusion.pl script is modified to use them, and fusion.md is
regenerated to capture the new patterns. There are also some
formatting-only changes to fusion.md that apparently weren't captured
after a previous commit of genfusion.pl.

If bootstrap/regtest passes, OK for trunk and backport to 11.2?

Thanks,
Aaron

gcc/
* rs6000.md (define_attr "type"): Add types for fusion.
* genfusion.pl (gen_ld_cmpi_p10): Use new fusion types.
(gen_2logical): Use new fusion types.
* fusion.md: Regenerate.
---
 gcc/config/rs6000/fusion.md| 288 -
 gcc/config/rs6000/genfusion.pl |   8 +-
 gcc/config/rs6000/rs6000.md|   4 +-
 3 files changed, 152 insertions(+), 148 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index 56478fcae1d..6d71bc2df73 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -35,7 +35,7 @@ (define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none"
(set (match_dup 2)
 (compare:CC (match_dup 0) (match_dup 3)))]
   ""
-  [(set_attr "type" "load")
+  [(set_attr "type" "fused_load_cmpi")
(set_attr "cost" "8")
(set_attr "length" "8")])
 
@@ -56,7 +56,7 @@ (define_insn_and_split "*ld_cmpldi_cr0_DI_clobber_CCUNS_none"
(set (match_dup 2)
 (compare:CCUNS (match_dup 0) (match_dup 3)))]
   ""
-  [(set_attr "type" "load")
+  [(set_attr "type" "fused_load_cmpi")
(set_attr "cost" "8")
(set_attr "length" "8")])
 
@@ -77,7 +77,7 @@ (define_insn_and_split "*ld_cmpdi_cr0_DI_DI_CC_none"
(set (match_dup 2)
 (compare:CC (match_dup 0) (match_dup 3)))]
   ""
-  [(set_attr "type" "load")
+  [(set_attr "type" "fused_load_cmpi")
(set_attr "cost" "8")
(set_attr "length" "8")])
 
@@ -98,7 +98,7 @@ (define_insn_and_split "*ld_cmpldi_cr0_DI_DI_CCUNS_none"
(set (match_dup 2)
 (compare:CCUNS (match_dup 0) (match_dup 3)))]
   ""
-  [(set_attr "type" "load")
+  [(set_attr "type" "fused_load_cmpi")
(set_attr "cost" "8")
(set_attr "length" "8")])
 
@@ -119,7 +119,7 @@ (define_insn_and_split "*lwa_cmpdi_cr0_SI_clobber_CC_none"
(set (match_dup 2)
 (compare:CC (match_dup 0) (match_dup 3)))]
   ""
-  [(set_attr "type" "load")
+  [(set_attr "type" "fused_load_cmpi")
(set_attr "cost" "8")
(set_attr "length" "8")])
 
@@ -140,7 +140,7 @@ (define_insn_and_split "*lwz_cmpldi_cr0_SI_clobber_CCUNS_none"
(set (match_dup 2)
 (compare:CCUNS (match_dup 0) (match_dup 3)))]
   ""
-  [(set_attr "type" "load")
+  [(set_attr "type" "fused_load_cmpi")
(set_attr "cost" "8")
(set_attr "length" "8")])
 
@@ -161,7 +161,7 @@ (define_insn_and_split "*lwa_cmpdi_cr0_SI_SI_CC_none"
(set (match_dup 2)
 (compare:CC (match_dup 0) (match_dup 3)))]
   ""
-  [(set_attr "type" "load")
+  [(set_attr "type" "fused_load_cmpi")
(set_attr "cost" "8")
(set_attr "length" "8")])
 
@@ -182,7 +182,7 @@ (define_insn_and_split "*lwz_cmpldi_cr0_SI_SI_CCUNS_none"
(set (match_dup 2)
 (compare:CCUNS (match_dup 0) (match_dup 3)))]
   ""
-  [(set_attr "type" "load")
+  [(set_attr "type" "fused_load_cmpi")
(set_attr "cost" "8")
(set_attr "length" "8")])
 
@@ -203,7 +203,7 @@ (define_insn_and_split "*lwa_cmpdi_cr0_SI_EXTSI_CC_sign"
(set (match_dup 2)
 (compare:CC (match_dup 0) (match_dup 3)))]
   ""
-  [(set_attr "type" "load")
+  [(set_attr "type" "fused_load_cmpi")
(set_attr "cost" "8")
(set_attr "length" "8")])
 
@@ -224,7 +224,7 @@ (define_insn_and_split "*lwz_cmpldi_cr0_SI_EXTSI_CCUNS_zero"
(set (match_dup 2)
 (compare:CCUNS (match_dup 0) (match_dup 3)))]
   ""
-  [(set_attr "type" "load")
+  [(set_attr "type" "fused_load_cmpi")
(set_attr "cost" "8")
(set_attr "length" "8")])
 
@@ -245,7 +245,7 @@ (define_insn_and_split "*lha_cmpdi_cr0_HI_clobber_CC_sign"
(set (match_dup 2)
 (compare:CC (match_dup 0) (match_dup 3)))]
   ""
-  [(set_attr "type" "load")
+  [(set_attr "type" "fused_load_cmpi")
(set_attr "cost" "8")
(set_attr "length" "8")])
 
@@ -266,7 +266,7 @@ (define_insn_and_split "*lhz_cmpldi_cr0_HI_clobber_CCUNS_zero"
(set (match_dup 2)
 (compare:CCUNS (match_dup 0) (match_dup 3)))]
   ""
-  [(set_attr "type" "load")
+  [(set_attr "type" "fused_load_cmpi")
(set_attr "cost" "8")
(set_attr "length" "8")])
 
@@ -287,7 +287,7 @@ (define_insn_and_split "*lha_cmpdi_cr0_HI_EXTHI_CC_sign"
(set (match_dup 2)
 (compare:CC (match_dup 0) (match_dup 3)))]
   ""
-  [(set_attr "type" "load")
+  [(set_attr "type" "fused_load_cmpi")
(set_attr "cost" "8")
(set_attr "length" "8")])
 
@@ -308,7 +308,7 @@ (define_insn_and_split "*lhz_cmpldi_cr0_HI_EXTHI_CCUNS_zero"
(set (match_dup 2)
 (compare:CCUNS (match_dup 0) (match_dup 3)))]
   ""
-  [(set_attr "type" "load")
+  [(set_attr "type" "fused_load_cmpi")
(set_attr "cost" "8")
   

[PATCH,rs6000] Test cases for p10 fusion patterns

2021-04-26 Thread acsawdey--- via Gcc-patches
From: Aaron Sawdey 

This adds some test cases to make sure that the combine patterns for p10
fusion are working.
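
For a sense of what these tests exercise, here is a hand-written C sketch
of the kind of function the 2logical test generates through its macros:
two dependent logical operations that the compiler may combine into a
fused pair. The function names are chosen for this example and do not
appear in the test. Whether a given pair is actually fused depends on the
target options, so the sketch only checks the arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* AND feeding OR: (a & b) | c.  */
uint64_t
and_then_or (uint64_t a, uint64_t b, uint64_t c)
{
  return (a & b) | c;
}

/* NOR feeding XOR, with the NOR result as the second operand:
   c ^ ~(a | b).  */
uint64_t
nor_then_xor_rev (uint64_t a, uint64_t b, uint64_t c)
{
  return c ^ ~(a | b);
}

/* Small self-check of both combinations.  */
void
check_logical_pairs (void)
{
  assert (and_then_or (0xF0, 0x3C, 0x01) == 0x31);
  assert (nor_then_xor_rev (0, 0, 0) == UINT64_MAX);
}
```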

OK for trunk?

gcc/testsuite/ChangeLog:
* gcc.target/powerpc/fusion-p10-ldcmpi.c: New file.
* gcc.target/powerpc/fusion-p10-2logical.c: New file.
---
 .../gcc.target/powerpc/fusion-p10-2logical.c  | 205 ++
 .../gcc.target/powerpc/fusion-p10-ldcmpi.c|  66 ++
 2 files changed, 271 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/fusion-p10-2logical.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c

diff --git a/gcc/testsuite/gcc.target/powerpc/fusion-p10-2logical.c b/gcc/testsuite/gcc.target/powerpc/fusion-p10-2logical.c
new file mode 100644
index 000..9a205373505
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fusion-p10-2logical.c
@@ -0,0 +1,205 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } } */
+/* { dg-options "-mdejagnu-cpu=power10 -O3 -dp" } */
+
+#include <altivec.h>
+#include <stdint.h>
+
+/* and/andc/eqv/nand/nor/or/orc/xor */
+#define AND(a,b) ((a)&(b))
+#define ANDC1(a,b) ((a)&((~b)))
+#define ANDC2(a,b) ((~(a))&(b))
+#define EQV(a,b) (~((a)^(b)))
+#define NAND(a,b) (~((a)&(b)))
+#define NOR(a,b) (~((a)|(b)))
+#define OR(a,b) ((a)|(b))
+#define ORC1(a,b) ((a)|((~b)))
+#define ORC2(a,b) ((~(a))|(b))
+#define XOR(a,b) ((a)^(b))
+#define TEST1(type, func) \
+  type func ## _and_T_ ## type (type a, type b, type c) { return AND(func(a,b),c); } \
+  type func ## _andc1_T_   ## type (type a, type b, type c) { return ANDC1(func(a,b),c); } \
+  type func ## _andc2_T_   ## type (type a, type b, type c) { return ANDC2(func(a,b),c); } \
+  type func ## _eqv_T_ ## type (type a, type b, type c) { return EQV(func(a,b),c); } \
+  type func ## _nand_T_## type (type a, type b, type c) { return NAND(func(a,b),c); } \
+  type func ## _nor_T_ ## type (type a, type b, type c) { return NOR(func(a,b),c); } \
+  type func ## _or_T_  ## type (type a, type b, type c) { return OR(func(a,b),c); } \
+  type func ## _orc1_T_## type (type a, type b, type c) { return ORC1(func(a,b),c); } \
+  type func ## _orc2_T_## type (type a, type b, type c) { return ORC2(func(a,b),c); } \
+  type func ## _xor_T_ ## type (type a, type b, type c) { return XOR(func(a,b),c); } \
+  type func ## _rev_and_T_ ## type (type a, type b, type c) { return AND(c,func(a,b)); } \
+  type func ## _rev_andc1_T_   ## type (type a, type b, type c) { return ANDC1(c,func(a,b)); } \
+  type func ## _rev_andc2_T_   ## type (type a, type b, type c) { return ANDC2(c,func(a,b)); } \
+  type func ## _rev_eqv_T_ ## type (type a, type b, type c) { return EQV(c,func(a,b)); } \
+  type func ## _rev_nand_T_## type (type a, type b, type c) { return NAND(c,func(a,b)); } \
+  type func ## _rev_nor_T_ ## type (type a, type b, type c) { return NOR(c,func(a,b)); } \
+  type func ## _rev_or_T_  ## type (type a, type b, type c) { return OR(c,func(a,b)); } \
+  type func ## _rev_orc1_T_## type (type a, type b, type c) { return ORC1(c,func(a,b)); } \
+  type func ## _rev_orc2_T_## type (type a, type b, type c) { return ORC2(c,func(a,b)); } \
+  type func ## _rev_xor_T_ ## type (type a, type b, type c) { return XOR(c,func(a,b)); }
+#define TEST(type)\
+  TEST1(type,AND) \
+  TEST1(type,ANDC1)   \
+  TEST1(type,ANDC2)   \
+  TEST1(type,EQV) \
+  TEST1(type,NAND)\
+  TEST1(type,NOR) \
+  TEST1(type,OR)  \
+  TEST1(type,ORC1)\
+  TEST1(type,ORC2)\
+  TEST1(type,XOR)
+
+typedef vector bool char vboolchar_t;
+typedef vector unsigned int vuint_t;
+
+TEST(uint8_t);
+TEST(int8_t);
+TEST(uint16_t);
+TEST(int16_t);
+TEST(uint32_t);
+TEST(int32_t);
+TEST(uint64_t);
+TEST(int64_t);
+TEST(vboolchar_t);
+TEST(vuint_t);
+
+/* Recreate with:
+   grep ' \*fuse_' fusion-p10-2logical.s|sed -e 's,^.*\*,,' |sort -k 7,7 |uniq -c|awk '{l=30-length($2); printf("/%s* { %s { scan-assembler-times \"%s\"%-*s    %4d } } *%s/\n","","dg-final",$2,l,"",$1,"");}'
+ */
+  
+/* { dg-final { scan-assembler-times "fuse_and_and/1"  16 } } */
+/* { dg-final { scan-assembler-times "fuse_and_and/2"  16 } } */
+/* { dg-final { scan-assembler-times "fuse_andc_and/0" 16 } } */
+/* { dg-final { scan-assembler-times "fuse_andc_and/1" 26 } } */
+/* { dg-final { scan-assembler-times "fuse_andc_and/2" 48 } } */
+/* { dg-final { scan-assembler-times "fuse_andc_and/3"  6 } } */
+/* { dg-final { scan-assembler-times "fuse_andc_or/0"  16 } } */
+/* { dg-final { scan-assembler-times "fuse_andc_or/1"  16 } } */
+/* { dg-final { scan-assembler-times "fuse_andc_or/2"  32 } } */
+/* { dg-final { scan-assembl

[PATCH,rs6000 0/2] p10 add-add and add-logical fusion series

2021-04-26 Thread acsawdey--- via Gcc-patches
From: Aaron Sawdey 

Two more sets of combine patterns for p10 fusion. These require 
the "Add insn types for fusion pairs" patch I posted earlier today.

If OK, I would like to put these in GCC 12 trunk and backport them to 11.2.

Thanks,
   Aaron

Aaron Sawdey (2):
  combine patterns for add-add fusion
  Fusion patterns for add-logical/logical-add

 gcc/config/rs6000/fusion.md   | 908 +-
 gcc/config/rs6000/genfusion.pl| 127 ++-
 gcc/config/rs6000/rs6000-cpus.def |   8 +-
 gcc/config/rs6000/rs6000.c|   9 +
 gcc/config/rs6000/rs6000.opt  |  12 +
 .../gcc.target/powerpc/fusion-p10-addadd.c|  41 +
 .../gcc.target/powerpc/fusion-p10-logadd.c|  98 ++
 7 files changed, 925 insertions(+), 278 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/fusion-p10-addadd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/fusion-p10-logadd.c

-- 
2.27.0



[PATCH,rs6000 1/2] combine patterns for add-add fusion

2021-04-26 Thread acsawdey--- via Gcc-patches
From: Aaron Sawdey 

This patch adds a function, gen_addadd, to genfusion.pl. It generates
two more patterns so that combine can fuse pairs of add instructions
and pairs of vaddudm instructions.
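
For illustration only (this function is not part of the patch): a minimal
C sketch of source that yields two dependent adds, which is the shape the
*fuse_add_add pattern is meant to match. The name sum3 is hypothetical, and
whether combine actually fuses the pair depends on the target flags and on
what earlier passes leave behind.

```c
#include <stdint.h>

/* Hypothetical example, not taken from the patch.  The inner (a + b)
   feeds the outer add, so the RTL is (plus (plus a b) c), which is
   the form the add-add fusion pattern describes.  */
int64_t sum3(int64_t a, int64_t b, int64_t c)
{
  return (a + b) + c;
}
```

Compiling something like this with -mdejagnu-cpu=power10 -O3 -dp, as the
test cases in this series do, should show which pattern name was used in
the assembler annotations.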

gcc/ChangeLog:

* gcc/config/rs6000/genfusion.pl (gen_addadd): New function.
* gcc/config/rs6000/fusion.md: Regenerate file.
* gcc/config/rs6000/rs6000-cpus.def: Add
OPTION_MASK_P10_FUSION_2ADD to masks.
* gcc/config/rs6000/rs6000.c (rs6000_option_override_internal):
Handle default value of OPTION_MASK_P10_FUSION_2ADD.
* gcc/config/rs6000/rs6000.opt: Add -mpower10-fusion-2add.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/fusion-p10-addadd.c: New file.
---
 gcc/config/rs6000/fusion.md   | 36 +++
 gcc/config/rs6000/genfusion.pl| 44 +++
 gcc/config/rs6000/rs6000-cpus.def |  4 +-
 gcc/config/rs6000/rs6000.c|  3 ++
 gcc/config/rs6000/rs6000.opt  |  4 ++
 .../gcc.target/powerpc/fusion-p10-addadd.c| 41 +
 6 files changed, 131 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/fusion-p10-addadd.c

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index 6d71bc2df73..6dfe1fa4508 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -2658,3 +2658,39 @@ (define_insn "*fuse_vxor_vxor"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
(set_attr "length" "8")])
+
+;; add-add fusion pattern generated by gen_addadd
+(define_insn "*fuse_add_add"
+  [(set (match_operand:GPR 3 "gpc_reg_operand" "=0,1,&r,r")
+(plus:GPR
+   (plus:GPR (match_operand:GPR 0 "gpc_reg_operand" "r,r,r,r")
+ (match_operand:GPR 1 "gpc_reg_operand" "%r,r,r,r"))
+   (match_operand:GPR 2 "gpc_reg_operand" "r,r,r,r")))
+   (clobber (match_scratch:GPR 4 "=X,X,X,&r"))]
+  "(TARGET_P10_FUSION && TARGET_P10_FUSION_2ADD)"
+  "@
+   add %3,%1,%0\;add %3,%3,%2
+   add %3,%1,%0\;add %3,%3,%2
+   add %3,%1,%0\;add %3,%3,%2
+   add %4,%1,%0\;add %3,%4,%2"
+  [(set_attr "type" "fuse_arithlog")
+   (set_attr "cost" "6")
+   (set_attr "length" "8")])
+
+;; vaddudm-vaddudm fusion pattern generated by gen_addadd
+(define_insn "*fuse_vaddudm_vaddudm"
+  [(set (match_operand:V2DI 3 "altivec_register_operand" "=0,1,&v,v")
+(plus:V2DI
+   (plus:V2DI (match_operand:V2DI 0 "altivec_register_operand" "v,v,v,v")
+ (match_operand:V2DI 1 "altivec_register_operand" "%v,v,v,v"))
+   (match_operand:V2DI 2 "altivec_register_operand" "v,v,v,v")))
+   (clobber (match_scratch:V2DI 4 "=X,X,X,&v"))]
+  "(TARGET_P10_FUSION && TARGET_P10_FUSION_2ADD)"
+  "@
+   vaddudm %3,%1,%0\;vaddudm %3,%3,%2
+   vaddudm %3,%1,%0\;vaddudm %3,%3,%2
+   vaddudm %3,%1,%0\;vaddudm %3,%3,%2
+   vaddudm %4,%1,%0\;vaddudm %3,%4,%2"
+  [(set_attr "type" "fuse_vec")
+   (set_attr "cost" "6")
+   (set_attr "length" "8")])
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index ce48fd94f95..8ed3c3617ec 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -240,8 +240,52 @@ EOF
   }
 }
 
+sub gen_addadd
+{
+my ($kind, $vchr, $op, $ty, $mode, $pred, $constraint);
+  KIND: foreach $kind ('scalar','vector') {
+  if ( $kind eq 'vector' ) {
+ $vchr = "v";
+ $op = "vaddudm";
+ $ty = "fuse_vec";
+ $mode = "V2DI";
+ $pred = "altivec_register_operand";
+ $constraint = "v";
+  } else {
+ $vchr = "";
+ $op = "add";
+ $ty = "fuse_arithlog";
+ $mode = "GPR";
+ $pred = "gpc_reg_operand";
+ $constraint = "r";
+  }
+my $c4 = "${constraint},${constraint},${constraint},${constraint}";
+print <<"EOF";
+
+;; ${op}-${op} fusion pattern generated by gen_addadd
+(define_insn "*fuse_${op}_${op}"
+  [(set (match_operand:${mode} 3 "${pred}" "=0,1,&${constraint},${constraint}")
+(plus:${mode}
+   (plus:${mode} (match_operand:${mode} 0 "${pred}" "${c4}")
+ (match_operand:${mode} 1 "${pred}" "%${c4}"))
+   (match_operand:${mode} 2 "${pred}" "${c4}")))
+   (clobber (match_scratch:${mode} 4 "=X,X,X,&${constraint}"))]
+  "(TARGET_P10_FUSION && TARGET_P10_FUSION_2ADD)"
+  "@
+   ${op} %3,%1,%0\\;${op} %3,%3,%2
+   ${op} %3,%1,%0\\;${op} %3,%3,%2
+   ${op} %3,%1,%0\\;${op} %3,%3,%2
+   ${op} %4,%1,%0\\;${op} %3,%4,%2"
+  [(set_attr "type" "${ty}")
+   (set_attr "cost" "6")
+   (set_attr "length" "8")])
+EOF
+  }
+}
+
 gen_ld_cmpi_p10();
 gen_2logical();
+gen_addadd();
 
 exit(0);
 
diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def
index cbbb42c1b3a..d46a91dd11b 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -85,7 +85,8 @@
 | OTHER_POWER10_MASKS  \
 |

[PATCH,rs6000 2/2] Fusion patterns for add-logical/logical-add

2021-04-26 Thread acsawdey--- via Gcc-patches
From: Aaron Sawdey 

This patch modifies the function in genfusion.pl that generates
the logical-logical patterns so that it can also generate the
add-logical and logical-add patterns, which are very similar.
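
As a rough sketch of the two shapes involved (these functions are
hypothetical and not from the patch): an add whose result feeds a logical
operation, and a logical operation whose result feeds an add. Which
specific pairs the hardware fuses, and which patterns genfusion.pl emits
for them, is determined by the generator and not by this example.

```c
#include <stdint.h>

/* Hypothetical: add feeding a logical op, i.e. (and (plus a b) c).  */
uint64_t add_then_and(uint64_t a, uint64_t b, uint64_t c)
{
  return (a + b) & c;
}

/* Hypothetical: logical op feeding an add, i.e. (plus (and a b) c).  */
uint64_t and_then_add(uint64_t a, uint64_t b, uint64_t c)
{
  return (a & b) + c;
}
```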

gcc/ChangeLog:
* config/rs6000/genfusion.pl (gen_logical_addsubf): Refactor to
add generation of logical-add and add-logical fusion pairs.
* config/rs6000/rs6000-cpus.def: Add new fusion to ISA 3.1 mask
and powerpc mask.
* config/rs6000/rs6000.c (rs6000_option_override_internal): Turn on
logical-add and add-logical fusion by default.
* config/rs6000/rs6000.opt: Add -mpower10-fusion-logical-add and
-mpower10-fusion-add-logical options.
* config/rs6000/fusion.md: Regenerate file.

gcc/testsuite/ChangeLog:
* gcc.target/powerpc/fusion-p10-logadd.c: New file.
---
 gcc/config/rs6000/fusion.md   | 876 --
 gcc/config/rs6000/genfusion.pl|  87 +-
 gcc/config/rs6000/rs6000-cpus.def |   4 +
 gcc/config/rs6000/rs6000.c|   6 +
 gcc/config/rs6000/rs6000.opt  |   8 +
 .../gcc.target/powerpc/fusion-p10-logadd.c|  98 ++
 6 files changed, 798 insertions(+), 281 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/fusion-p10-logadd.c

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index 6dfe1fa4508..6c7c94c44c1 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -355,11 +355,11 @@ (define_insn_and_split "*lbz_cmpldi_cr0_QI_GPR_CCUNS_zero"
(set_attr "length" "8")])
 
 
-;; logical-logical fusion pattern generated by gen_2logical
+;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; scalar and -> and
 (define_insn "*fuse_and_and"
   [(set (match_operand:GPR 3 "gpc_reg_operand" "=0,1,&r,r")
-(and:GPR (and:GPR (match_operand:GPR 0 "gpc_reg_operand" "r,r,r,r") 
+(and:GPR (and:GPR (match_operand:GPR 0 "gpc_reg_operand" "r,r,r,r")
   (match_operand:GPR 1 "gpc_reg_operand" "%r,r,r,r"))
  (match_operand:GPR 2 "gpc_reg_operand" "r,r,r,r")))
(clobber (match_scratch:GPR 4 "=X,X,X,&r"))]
@@ -373,11 +373,11 @@ (define_insn "*fuse_and_and"
(set_attr "cost" "6")
(set_attr "length" "8")])
 
-;; logical-logical fusion pattern generated by gen_2logical
+;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; scalar andc -> and
 (define_insn "*fuse_andc_and"
   [(set (match_operand:GPR 3 "gpc_reg_operand" "=0,1,&r,r")
-(and:GPR (and:GPR (not:GPR (match_operand:GPR 0 "gpc_reg_operand" 
"r,r,r,r")) 
+(and:GPR (and:GPR (not:GPR (match_operand:GPR 0 "gpc_reg_operand" 
"r,r,r,r"))
   (match_operand:GPR 1 "gpc_reg_operand" "r,r,r,r"))
  (match_operand:GPR 2 "gpc_reg_operand" "r,r,r,r")))
(clobber (match_scratch:GPR 4 "=X,X,X,&r"))]
@@ -391,11 +391,11 @@ (define_insn "*fuse_andc_and"
(set_attr "cost" "6")
(set_attr "length" "8")])
 
-;; logical-logical fusion pattern generated by gen_2logical
+;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; scalar eqv -> and
 (define_insn "*fuse_eqv_and"
   [(set (match_operand:GPR 3 "gpc_reg_operand" "=0,1,&r,r")
-(and:GPR (not:GPR (xor:GPR (match_operand:GPR 0 "gpc_reg_operand" 
"r,r,r,r") 
+(and:GPR (not:GPR (xor:GPR (match_operand:GPR 0 "gpc_reg_operand" 
"r,r,r,r")
   (match_operand:GPR 1 "gpc_reg_operand" "r,r,r,r")))
  (match_operand:GPR 2 "gpc_reg_operand" "r,r,r,r")))
(clobber (match_scratch:GPR 4 "=X,X,X,&r"))]
@@ -409,11 +409,11 @@ (define_insn "*fuse_eqv_and"
(set_attr "cost" "6")
(set_attr "length" "8")])
 
-;; logical-logical fusion pattern generated by gen_2logical
+;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; scalar nand -> and
 (define_insn "*fuse_nand_and"
   [(set (match_operand:GPR 3 "gpc_reg_operand" "=0,1,&r,r")
-(and:GPR (ior:GPR (not:GPR (match_operand:GPR 0 "gpc_reg_operand" 
"r,r,r,r")) 
+(and:GPR (ior:GPR (not:GPR (match_operand:GPR 0 "gpc_reg_operand" 
"r,r,r,r"))
   (not:GPR (match_operand:GPR 1 "gpc_reg_operand" 
"r,r,r,r")))
  (match_operand:GPR 2 "gpc_reg_operand" "r,r,r,r")))
(clobber (match_scratch:GPR 4 "=X,X,X,&r"))]
@@ -427,11 +427,11 @@ (define_insn "*fuse_nand_and"
(set_attr "cost" "6")
(set_attr "length" "8")])
 
-;; logical-logical fusion pattern generated by gen_2logical
+;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; scalar nor -> and
 (define_insn "*fuse_nor_and"
   [(set (match_operand:GPR 3 "gpc_reg_operand" "=0,1,&r,r")
-(and:GPR (and:GPR (not:GPR (match_operand:GPR 0 "gpc_reg_operand" 
"r,r,r,r")) 
+(and:GPR (and:GPR (not:GPR (match_operand:GPR 0 "gpc_reg_operand" 
"r,r,r,r"))
   (not:GPR (match_operand:GPR 1 "gpc_reg_ope

[PATCH] rs6000: add option -mblock-ops-unaligned-vsx

2020-07-24 Thread acsawdey--- via Gcc-patches
From: Aaron Sawdey 

This option is mostly being added to provide -mno-block-ops-unaligned-vsx.
Its default follows -mefficient-unaligned-vsx. The option controls
the use of unaligned VSX loads/stores in the inline expansion
of memcpy() and memmove(). The use case is code that does a memcpy
to memory-mapped device memory that is cache-inhibited. On some
PowerPC processors, unaligned VSX operations on such memory have to
be emulated by the kernel, which is very slow.
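
To make the use case concrete, here is a small sketch (the struct and
function names are made up for illustration). A fixed-size memcpy like
this is a candidate for inline expansion; whether GCC actually expands
it inline, and with which instructions, depends on the size, the CPU
selected and the options in effect.

```c
#include <string.h>

/* Hypothetical 32-byte record, e.g. something written to a device
   window.  Nothing here guarantees 16-byte alignment.  */
struct packet
{
  unsigned char bytes[32];
};

/* A small constant-size copy.  If dst points at cache-inhibited
   memory, building this translation unit with
   -mno-block-ops-unaligned-vsx is intended to keep the inline
   expansion from using unaligned VSX loads/stores.  */
void copy_packet(struct packet *dst, const struct packet *src)
{
  memcpy(dst, src, sizeof *dst);
}
```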

I'll be submitting additional patches to change the inline expansion
of memcpy/memmove based on this option.

Ok for trunk if regstrap passes on powerpc64le power8?

Thanks!
   Aaron

gcc/ChangeLog:

* config/rs6000/rs6000.c (rs6000_option_override_internal):
Set the default value for the option.
* config/rs6000/rs6000.opt: Add -mblock-ops-unaligned-vsx.
* doc/invoke.texi: Document -mblock-ops-unaligned-vsx.
---
 gcc/config/rs6000/rs6000.c   | 12 
 gcc/config/rs6000/rs6000.opt |  4 
 gcc/doc/invoke.texi  |  8 
 3 files changed, 24 insertions(+)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 6bea544d26a..d6c9bd8de21 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -3979,6 +3979,16 @@ rs6000_option_override_internal (bool global_init_p)
}
 }
 
+  if (!(rs6000_isa_flags_explicit & OPTION_MASK_BLOCK_OPS_UNALIGNED_VSX))
+{
+  if (TARGET_EFFICIENT_UNALIGNED_VSX)
+   rs6000_isa_flags |= OPTION_MASK_BLOCK_OPS_UNALIGNED_VSX;
+  else
+   rs6000_isa_flags &= ~OPTION_MASK_BLOCK_OPS_UNALIGNED_VSX;
+}
+
+  if (TARGET_BLOCK_OPS_UNALIGNED_VSX) 
printf("TARGET_BLOCK_OPS_UNALIGNED_VSX\n");
+
   /* Use long double size to select the appropriate long double.  We use
  TYPE_PRECISION to differentiate the 3 different long double types.  We map
  128 into the precision used for TFmode.  */
@@ -23167,6 +23177,8 @@ struct rs6000_opt_mask {
 static struct rs6000_opt_mask const rs6000_opt_masks[] =
 {
   { "altivec", OPTION_MASK_ALTIVEC,false, true  },
+  { "block-ops-unaligned-vsx",  OPTION_MASK_BLOCK_OPS_UNALIGNED_VSX,
+false, true  },
   { "cmpb",OPTION_MASK_CMPB,   false, true  },
   { "crypto",  OPTION_MASK_CRYPTO, false, true  },
   { "direct-move", OPTION_MASK_DIRECT_MOVE,false, true  },
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index 6b426f2aaf1..22b4e456aad 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -324,6 +324,10 @@ mblock-move-inline-limit=
 Target Report Var(rs6000_block_move_inline_limit) Init(0) RejectNegative Joined UInteger Save
 Max number of bytes to move inline.
 
+mblock-ops-unaligned-vsx
+Target Report Mask(BLOCK_OPS_UNALIGNED_VSX) Var(rs6000_isa_flags)
+Generate unaligned VSX load/store for inline expansion of memcpy/memmove.
+
 mblock-compare-inline-limit=
 Target Report Var(rs6000_block_compare_inline_limit) Init(63) RejectNegative Joined UInteger Save
 Max number of bytes to compare without loops.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index ba18e05fb1a..5449c338370 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1182,6 +1182,7 @@ See RS/6000 and PowerPC Options.
 -mblock-move-inline-limit=@var{num} @gol
 -mblock-compare-inline-limit=@var{num} @gol
 -mblock-compare-inline-loop-limit=@var{num} @gol
+-mno-block-ops-unaligned-vsx @gol
 -mstring-compare-inline-limit=@var{num} @gol
 -misel  -mno-isel @gol
 -mvrsave  -mno-vrsave @gol
@@ -27023,6 +27024,13 @@ store instructions when the option @option{-mcpu=future} is used.
 @opindex mno-mma
 Generate (do not generate) the MMA instructions when the option
 @option{-mcpu=future} is used.
+
+@item -mblock-ops-unaligned-vsx
+@itemx -mno-block-ops-unaligned-vsx
+@opindex mblock-ops-unaligned-vsx
+@opindex mno-block-ops-unaligned-vsx
+Generate (do not generate) unaligned vsx loads and stores for
+inline expansion of @code{memcpy} and @code{memmove}.
 @end table
 
 @node RX Options
-- 
2.25.1



[PATCH][PR target/94542]Don't allow PC-relative addressing for TLS data

2020-04-10 Thread acsawdey via Gcc-patches
One of the things that address_to_insn_form() is used for is determining 
whether a PC-relative addressing instruction can be used. In 
particular, the predicate pcrel_external_address and the function 
prefixed_paddi_p() both use it for this purpose. What emerged in 
PR target/94542 is that it should also check whether the associated 
symbol_ref is a TLS symbol of some kind, because TLS symbols cannot be 
addressed PC-relative. This patch fixes both places in 
address_to_insn_form() where it looks at a symbol_ref.
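
For context, a minimal sketch of the kind of source that produces a TLS
symbol_ref (the names are hypothetical and this is not the PR's test
case). The address of tls_counter is per-thread, so it has to be formed
through the TLS access sequence and not as a PC-relative reference to
the symbol.

```c
/* Hypothetical thread-local variable; each thread gets its own copy.  */
_Thread_local int tls_counter;

/* Reads and writes tls_counter.  With -mcpu=power10 and PC-relative
   addressing enabled, ordinary globals may be addressed PC-relative,
   but a TLS symbol like this one must not be.  */
int bump_tls_counter(void)
{
  return ++tls_counter;
}
```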


Regression tests passed with trunk 
38e62001c576b8c6ba2e08eb4673d69ec4c5b0f9 on ppc64le power9, and 
PC-relative code is now correct. OK for trunk?


Thanks!
   Aaron

2020-04-10  Aaron Sawdey  

PR target/94542
* config/rs6000/rs6000.c (address_to_insn_form): Do not attempt to
use PC-relative addressing for TLS references.
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 2b6613bcb7e..c77e60a718f 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -24824,15 +24824,21 @@ address_to_insn_form (rtx addr,
   if (GET_RTX_CLASS (GET_CODE (addr)) == RTX_AUTOINC)
 return INSN_FORM_UPDATE;
 
-  /* Handle PC-relative symbols and labels.  Check for both local and external
- symbols.  Assume labels are always local.  */
+  /* Handle PC-relative symbols and labels.  Check for both local and
+ external symbols.  Assume labels are always local. TLS symbols
+ are not PC-relative.  */
   if (TARGET_PCREL)
 {
-  if (SYMBOL_REF_P (addr) && !SYMBOL_REF_LOCAL_P (addr))
-	return INSN_FORM_PCREL_EXTERNAL;
-
-  if (SYMBOL_REF_P (addr) || LABEL_REF_P (addr))
+  if (LABEL_REF_P (addr))
 	return INSN_FORM_PCREL_LOCAL;
+
+  if (SYMBOL_REF_P (addr) && !SYMBOL_REF_TLS_MODEL (addr))
+	{
+	  if (!SYMBOL_REF_LOCAL_P (addr))
+	return INSN_FORM_PCREL_EXTERNAL;
+	  else
+	return INSN_FORM_PCREL_LOCAL;
+	}
 }
 
   if (GET_CODE (addr) == CONST)
@@ -24866,14 +24872,19 @@ address_to_insn_form (rtx addr,
 return INSN_FORM_BAD;
 
   /* Check for local and external PC-relative addresses.  Labels are always
- local.  */
+ local.  TLS symbols are not PC-relative.  */
   if (TARGET_PCREL)
 {
-  if (SYMBOL_REF_P (op0) && !SYMBOL_REF_LOCAL_P (op0))
-	return INSN_FORM_PCREL_EXTERNAL;
-
-  if (SYMBOL_REF_P (op0) || LABEL_REF_P (op0))
+  if (LABEL_REF_P (op0))
 	return INSN_FORM_PCREL_LOCAL;
+
+  if (SYMBOL_REF_P (op0) && !SYMBOL_REF_TLS_MODEL (op0))
+	{
+	  if (!SYMBOL_REF_LOCAL_P (op0))
+	return INSN_FORM_PCREL_EXTERNAL;
+	  else
+	return INSN_FORM_PCREL_LOCAL;
+	}
 }
 
   /* If it isn't PC-relative, the address must use a base register.  */


Re: [PATCH][PR target/94542]Don't allow PC-relative addressing for TLS data

2020-04-13 Thread acsawdey via Gcc-patches

On 2020-04-13 10:08, will schmidt wrote:

On Fri, 2020-04-10 at 18:00 -0500, acsawdey via Gcc-patches wrote:

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 2b6613bcb7e..c77e60a718f 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -24824,15 +24824,21 @@ address_to_insn_form (rtx addr,
   if (GET_RTX_CLASS (GET_CODE (addr)) == RTX_AUTOINC)
 return INSN_FORM_UPDATE;

-  /* Handle PC-relative symbols and labels.  Check for both local and external
- symbols.  Assume labels are always local.  */
+  /* Handle PC-relative symbols and labels.  Check for both local and
+ external symbols.  Assume labels are always local. TLS symbols
+ are not PC-relative.  */


Does the assumption need a qualifier for target rs6000?  (or some combination
of TLS/PC-relative?)   There are users of the LABEL_REF_NONLOCAL_P() for
mips,pa,sparc targets.


Yeah, I think it's reasonable to say "TLS symbols are not PC-relative on target
rs6000." That was certainly my intent.

Aaron


[PATCH,rs6000] Fusion patterns for logical-logical

2020-12-10 Thread acsawdey--- via Gcc-patches
From: Aaron Sawdey 

This patch adds a new function to genfusion.pl to generate patterns for
logical-logical fusion. They are enabled by default for power10 and can
be disabled by -mno-power10-fusion-2logical or -mno-power10-fusion.
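
As a hedged illustration of what "logical-logical" means here (these two
functions are made up for this note and are not part of the patch): one
logical operation whose result is the input of a second one. The
generated patterns cover many inner/outer combinations; these are just
two of the shapes.

```c
#include <stdint.h>

/* Hypothetical: and feeding or, i.e. (ior (and a b) c).  */
uint32_t and_then_or(uint32_t a, uint32_t b, uint32_t c)
{
  return (a & b) | c;
}

/* Hypothetical: eqv feeding and, i.e. (and (not (xor a b)) c).  */
uint32_t eqv_then_and(uint32_t a, uint32_t b, uint32_t c)
{
  return ~(a ^ b) & c;
}
```

Building with and without -mno-power10-fusion-2logical and comparing the
-dp annotations is one way to see whether the fused patterns are being
selected.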

This patch builds on top of the load-cmpi patch posted earlier this week.

Bootstrap passed on ppc64le/power10. If regtests pass, OK for trunk?

gcc/ChangeLog
* config/rs6000/genfusion.pl (gen_2logical): New function to
generate patterns for logical-logical fusion.
* config/rs6000/fusion.md: Regenerated patterns.
* config/rs6000/rs6000-cpus.def: Add
OPTION_MASK_P10_FUSION_2LOGICAL.
* config/rs6000/rs6000.c (rs6000_option_override_internal):
Enable logical-logical fusion for p10.
* config/rs6000/rs6000.opt: Add -mpower10-fusion-2logical.
---
 gcc/config/rs6000/fusion.md   | 2176 +
 gcc/config/rs6000/genfusion.pl|   89 ++
 gcc/config/rs6000/rs6000-cpus.def |4 +-
 gcc/config/rs6000/rs6000.c|3 +
 gcc/config/rs6000/rs6000.opt  |4 +
 5 files changed, 2275 insertions(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index a4d3a6ae7f3..1ddbe7fe3d2 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -355,3 +355,2179 @@ (define_insn_and_split "*lbz_cmpldi_cr0_QI_GPR_CCUNS_zero"
(set_attr "cost" "8")
(set_attr "length" "8")])
 
+
+;; logical-logical fusion pattern generated by gen_2logical
+;; kind: scalar outer: and op and rtl and inv 0 comp 0
+;; inner: and op and rtl and inv 0 comp 0
+(define_insn "*fuse_and_and"
+  [(set (match_operand:GPR 3 "gpc_reg_operand" "=&r,0,1,r")
+(and:GPR (and:GPR (match_operand:GPR 0 "gpc_reg_operand" "r,r,r,r") (match_operand:GPR 1 "gpc_reg_operand" "%r,r,r,r")) (match_operand:GPR 2 "gpc_reg_operand" "r,r,r,r")))
+   (clobber (match_scratch:GPR 4 "=X,X,X,r"))]
+  "(TARGET_P10_FUSION && TARGET_P10_FUSION_2LOGICAL)"
+  "@
+   and %3,%1,%0\;and %3,%3,%2
+   and %0,%1,%0\;and %0,%0,%2
+   and %1,%1,%0\;and %1,%1,%2
+   and %4,%1,%0\;and %3,%4,%2"
+  [(set_attr "type" "logical")
+   (set_attr "cost" "6")
+   (set_attr "length" "8")])
+
+;; logical-logical fusion pattern generated by gen_2logical
+;; kind: scalar outer: and op and rtl and inv 0 comp 0
+;; inner: andc op andc rtl and inv 0 comp 1
+(define_insn "*fuse_andc_and"
+  [(set (match_operand:GPR 3 "gpc_reg_operand" "=&r,0,1,r")
+(and:GPR (and:GPR (not:GPR (match_operand:GPR 0 "gpc_reg_operand" "r,r,r,r")) (match_operand:GPR 1 "gpc_reg_operand" "r,r,r,r")) (match_operand:GPR 2 "gpc_reg_operand" "r,r,r,r")))
+   (clobber (match_scratch:GPR 4 "=X,X,X,r"))]
+  "(TARGET_P10_FUSION && TARGET_P10_FUSION_2LOGICAL)"
+  "@
+   andc %3,%1,%0\;and %3,%3,%2
+   andc %0,%1,%0\;and %0,%0,%2
+   andc %1,%1,%0\;and %1,%1,%2
+   andc %4,%1,%0\;and %3,%4,%2"
+  [(set_attr "type" "logical")
+   (set_attr "cost" "6")
+   (set_attr "length" "8")])
+
+;; logical-logical fusion pattern generated by gen_2logical
+;; kind: scalar outer: and op and rtl and inv 0 comp 0
+;; inner: eqv op eqv rtl xor inv 1 comp 0
+(define_insn "*fuse_eqv_and"
+  [(set (match_operand:GPR 3 "gpc_reg_operand" "=&r,0,1,r")
+(and:GPR (not:GPR (xor:GPR (match_operand:GPR 0 "gpc_reg_operand" "r,r,r,r") (match_operand:GPR 1 "gpc_reg_operand" "r,r,r,r"))) (match_operand:GPR 2 "gpc_reg_operand" "r,r,r,r")))
+   (clobber (match_scratch:GPR 4 "=X,X,X,r"))]
+  "(TARGET_P10_FUSION && TARGET_P10_FUSION_2LOGICAL)"
+  "@
+   eqv %3,%1,%0\;and %3,%3,%2
+   eqv %0,%1,%0\;and %0,%0,%2
+   eqv %1,%1,%0\;and %1,%1,%2
+   eqv %4,%1,%0\;and %3,%4,%2"
+  [(set_attr "type" "logical")
+   (set_attr "cost" "6")
+   (set_attr "length" "8")])
+
+;; logical-logical fusion pattern generated by gen_2logical
+;; kind: scalar outer: and op and rtl and inv 0 comp 0
+;; inner: nand op nand rtl ior inv 0 comp 3
+(define_insn "*fuse_nand_and"
+  [(set (match_operand:GPR 3 "gpc_reg_operand" "=&r,0,1,r")
+(and:GPR (ior:GPR (not:GPR (match_operand:GPR 0 "gpc_reg_operand" "r,r,r,r")) (not:GPR (match_operand:GPR 1 "gpc_reg_operand" "r,r,r,r"))) (match_operand:GPR 2 "gpc_reg_operand" "r,r,r,r")))
+   (clobber (match_scratch:GPR 4 "=X,X,X,r"))]
+  "(TARGET_P10_FUSION && TARGET_P10_FUSION_2LOGICAL)"
+  "@
+   nand %3,%1,%0\;and %3,%3,%2
+   nand %0,%1,%0\;and %0,%0,%2
+   nand %1,%1,%0\;and %1,%1,%2
+   nand %4,%1,%0\;and %3,%4,%2"
+  [(set_attr "type" "logical")
+   (set_attr "cost" "6")
+   (set_attr "length" "8")])
+
+;; logical-logical fusion pattern generated by gen_2logical
+;; kind: scalar outer: and op and rtl and inv 0 comp 0
+;; inner: nor op nor rtl and inv 0 comp 3
+(define_insn "*fuse_nor_and"
+  [(set (match_operand:GPR 3 "gpc_reg_operand" "=&r,0,1,r")
+(and:GPR (and:GPR (not:GPR (match_operand:GPR 0 "gpc_reg_operand" 
"r,r,r,r")) (not:GPR (match_operand:GPR 1 "gpc_reg_operand" "r,r,r,r"))) 
(match_operand:GPR 2 "gpc_reg_operand" "r,r,r,r

[PATCH,rs6000] Test cases for p10 fusion patterns

2020-12-11 Thread acsawdey--- via Gcc-patches
From: Aaron Sawdey 

This adds some test cases to make sure that the combine patterns for p10
fusion are working.

These test cases pass on power10. OK for trunk after the 2 previous patches
for the fusion patterns go in?

Thanks!
   Aaron

gcc/testsuite/ChangeLog:
* gcc.target/powerpc/fusion-p10-ldcmpi.c: New file.
* gcc.target/powerpc/fusion-p10-2logical.c: New file.
---
 .../gcc.target/powerpc/fusion-p10-2logical.c  | 201 ++
 .../gcc.target/powerpc/fusion-p10-ldcmpi.c|  66 ++
 2 files changed, 267 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/fusion-p10-2logical.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c

diff --git a/gcc/testsuite/gcc.target/powerpc/fusion-p10-2logical.c b/gcc/testsuite/gcc.target/powerpc/fusion-p10-2logical.c
new file mode 100644
index 000..cfe8f6c679a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fusion-p10-2logical.c
@@ -0,0 +1,201 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } } */
+/* { dg-options "-mdejagnu-cpu=power10 -O3 -dp" } */
+
+#include <altivec.h>
+#include <stdint.h>
+
+/* and/andc/eqv/nand/nor/or/orc/xor */
+#define AND(a,b) ((a)&(b))
+#define ANDC1(a,b) ((a)&((~b)))
+#define ANDC2(a,b) ((~(a))&(b))
+#define EQV(a,b) (~((a)^(b)))
+#define NAND(a,b) (~((a)&(b)))
+#define NOR(a,b) (~((a)|(b)))
+#define OR(a,b) ((a)|(b))
+#define ORC1(a,b) ((a)|((~b)))
+#define ORC2(a,b) ((~(a))|(b))
+#define XOR(a,b) ((a)^(b))
+#define TEST1(type, func)  \
+  type func ## _and_T_ ## type (type a, type b, type c) { return AND(func(a,b),c); } \
+  type func ## _andc1_T_   ## type (type a, type b, type c) { return ANDC1(func(a,b),c); } \
+  type func ## _andc2_T_   ## type (type a, type b, type c) { return ANDC2(func(a,b),c); } \
+  type func ## _eqv_T_ ## type (type a, type b, type c) { return EQV(func(a,b),c); } \
+  type func ## _nand_T_## type (type a, type b, type c) { return NAND(func(a,b),c); } \
+  type func ## _nor_T_ ## type (type a, type b, type c) { return NOR(func(a,b),c); } \
+  type func ## _or_T_  ## type (type a, type b, type c) { return OR(func(a,b),c); } \
+  type func ## _orc1_T_## type (type a, type b, type c) { return ORC1(func(a,b),c); } \
+  type func ## _orc2_T_## type (type a, type b, type c) { return ORC2(func(a,b),c); } \
+  type func ## _xor_T_ ## type (type a, type b, type c) { return XOR(func(a,b),c); } \
+  type func ## _rev_and_T_ ## type (type a, type b, type c) { return AND(c,func(a,b)); } \
+  type func ## _rev_andc1_T_   ## type (type a, type b, type c) { return ANDC1(c,func(a,b)); } \
+  type func ## _rev_andc2_T_   ## type (type a, type b, type c) { return ANDC2(c,func(a,b)); } \
+  type func ## _rev_eqv_T_ ## type (type a, type b, type c) { return EQV(c,func(a,b)); } \
+  type func ## _rev_nand_T_## type (type a, type b, type c) { return NAND(c,func(a,b)); } \
+  type func ## _rev_nor_T_ ## type (type a, type b, type c) { return NOR(c,func(a,b)); } \
+  type func ## _rev_or_T_  ## type (type a, type b, type c) { return OR(c,func(a,b)); } \
+  type func ## _rev_orc1_T_## type (type a, type b, type c) { return ORC1(c,func(a,b)); } \
+  type func ## _rev_orc2_T_## type (type a, type b, type c) { return ORC2(c,func(a,b)); } \
+  type func ## _rev_xor_T_ ## type (type a, type b, type c) { return XOR(c,func(a,b)); }
+#define TEST(type)\
+  TEST1(type,AND) \
+  TEST1(type,ANDC1)   \
+  TEST1(type,ANDC2)   \
+  TEST1(type,EQV) \
+  TEST1(type,NAND)\
+  TEST1(type,NOR) \
+  TEST1(type,OR)  \
+  TEST1(type,ORC1)\
+  TEST1(type,ORC2)\
+  TEST1(type,XOR)
+
+typedef vector bool char vboolchar_t;
+typedef vector unsigned int vuint_t;
+
+TEST(uint8_t);
+TEST(int8_t);
+TEST(uint16_t);
+TEST(int16_t);
+TEST(uint32_t);
+TEST(int32_t);
+TEST(uint64_t);
+TEST(int64_t);
+TEST(vboolchar_t);
+TEST(vuint_t);
+  
+/* { dg-final { scan-assembler-times "fuse_and_and/0"16 } } */
+/* { dg-final { scan-assembler-times "fuse_and_and/2"16 } } */
+/* { dg-final { scan-assembler-times "fuse_andc_and/0"   48 } } */
+/* { dg-final { scan-assembler-times "fuse_andc_and/1"   16 } } */
+/* { dg-final { scan-assembler-times "fuse_andc_and/2"   26 } } */
+/* { dg-final { scan-assembler-times "fuse_andc_and/3"6 } } */
+/* { dg-final { scan-assembler-times "fuse_andc_or/0"32 } } */
+/* { dg-final { scan-assembler-times "fuse_andc_or/1"16 } } */
+/* { dg-final { scan-assembler-times "fuse_andc_or/2"16 } } */
+/* { dg-final { scan-assembler-times "fuse_andc_orc/0"   48 } } */
+/* { dg-final { scan-assembler-times "fuse_andc_orc/1"8 } } */
+/* { dg-final { scan-assembler-times "fuse_andc_orc/2"8 } } */
+/* { dg-final { scan-assembler-times "fuse_andc_xor/0"   32 } } */
+/* { dg-final { scan-assembler-times "fus

[PATCH,rs6000] Make MMA builtins use opaque modes

2020-11-17 Thread acsawdey--- via Gcc-patches
From: Aaron Sawdey 

This patch changes powerpc MMA builtins to use the new opaque
mode class and use modes OO (32 bytes) and XO (64 bytes)
instead of POI/PXI. Using the opaque modes prevents
optimization from trying to do anything with vector
pair/quad, which was the problem we were seeing with the
partial integer modes.

OK for trunk if bootstrap/regtest passes? 

gcc/
* gcc/config/rs6000/mma.md (unspec):
Add assemble/extract UNSPECs.
(movoi): Change to movoo.
(*movpoi): Change to *movoo.
(movxi): Change to movxo.
(*movpxi): Change to *movxo.
(mma_assemble_pair): Change to OO mode.
(*mma_assemble_pair): New define_insn_and_split.
(mma_disassemble_pair): New define_expand.
(*mma_disassemble_pair): New define_insn_and_split.
(mma_assemble_acc): Change to XO mode.
(*mma_assemble_acc): Change to XO mode.
(mma_disassemble_acc): New define_expand.
(*mma_disassemble_acc): New define_insn_and_split.
(mma_): Change to XO mode.
(mma_): Change to XO mode.
(mma_): Change to XO mode.
(mma_): Change to OO mode.
(mma_): Change to XO/OO mode.
(mma_): Change to XO mode.
(mma_): Change to XO mode.
(mma_): Change to XO mode.
(mma_): Change to XO mode.
(mma_): Change to XO mode.
(mma_): Change to XO mode.
(mma_): Change to XO/OO mode.
(mma_): Change to XO/OO mode.
(mma_): Change to XO mode.
(mma_): Change to XO mode.
* gcc/config/rs6000/predicates.md (input_operand): Allow opaque.
(mma_disassemble_output_operand): New predicate.
* gcc/config/rs6000/rs6000-builtin.def:
Changes to disassemble builtins.
* gcc/config/rs6000/rs6000-call.c (rs6000_return_in_memory):
Disallow __vector_pair/__vector_quad as return types.
(rs6000_promote_function_mode): Remove function return type
check because we can't test it here any more.
(rs6000_function_arg): Do not allow __vector_pair/__vector_quad
as function arguments.
(rs6000_gimple_fold_mma_builtin):
Handle mma_disassemble_* builtins.
(rs6000_init_builtins): Create types for XO/OO modes.
* gcc/config/rs6000/rs6000-modes.def: Create XO and OO modes.
* gcc/config/rs6000/rs6000-string.c (expand_block_move):
Update to OO mode.
* gcc/config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok_uncached):
Update for XO/OO modes.
(rs6000_modes_tieable_p): Update for XO/OO modes.
(rs6000_debug_reg_global): Update for XO/OO modes.
(rs6000_setup_reg_addr_masks): Update for XO/OO modes.
(rs6000_init_hard_regno_mode_ok): Update for XO/OO modes.
(reg_offset_addressing_ok_p): Update for XO/OO modes.
(rs6000_emit_move): Update for XO/OO modes.
(rs6000_preferred_reload_class): Update for XO/OO modes.
(rs6000_split_multireg_move): Update for XO/OO modes.
(rs6000_mangle_type): Update for opaque types.
(rs6000_invalid_conversion): Update for XO/OO modes.
* gcc/config/rs6000/rs6000.h (VECTOR_ALIGNMENT_P):
Update for XO/OO modes.
* gcc/config/rs6000/rs6000.md (RELOAD): Update for XO/OO modes.
gcc/testsuite/
* gcc.target/powerpc/mma-double-test.c (main): Call abort for failure.
* gcc.target/powerpc/mma-single-test.c (main): Call abort for failure.
* gcc.target/powerpc/pr96506.c: Rename to pr96506-1.c.
* gcc.target/powerpc/pr96506-2.c: New test.
---
 gcc/config/rs6000/mma.md  | 385 ++
 gcc/config/rs6000/predicates.md   |  14 +-
 gcc/config/rs6000/rs6000-builtin.def  |  14 +-
 gcc/config/rs6000/rs6000-call.c   | 144 ---
 gcc/config/rs6000/rs6000-modes.def|  10 +-
 gcc/config/rs6000/rs6000-string.c |   6 +-
 gcc/config/rs6000/rs6000.c| 189 +
 gcc/config/rs6000/rs6000.h|   3 +-
 gcc/config/rs6000/rs6000.md   |   2 +-
 .../gcc.target/powerpc/mma-double-test.c  |   3 +
 .../gcc.target/powerpc/mma-single-test.c  |   3 +
 .../powerpc/{pr96506.c => pr96506-1.c}|  24 --
 gcc/testsuite/gcc.target/powerpc/pr96506-2.c  |  38 ++
 13 files changed, 482 insertions(+), 353 deletions(-)
 rename gcc/testsuite/gcc.target/powerpc/{pr96506.c => pr96506-1.c} (61%)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr96506-2.c

diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
index a3fd28bdd0a..7d520e19b0d 100644
--- a/gcc/config/rs6000/mma.md
+++ b/gcc/config/rs6000/mma.md
@@ -19,24 +19,19 @@
 ;; along with GCC; see the file COPYING3.  If not see
 ;; .
 
-;; The MMA patterns use the multi-register PXImode and POImode partial
+;; The MMA patterns use the multi-register XOmode and OOmode partial
 ;; integer modes to impl

[PATCH, rs6000] Re-enable vector pair memcpy/memmove expansion

2020-11-17 Thread acsawdey--- via Gcc-patches
From: Aaron Sawdey 

After the MMA opaque mode patch goes in, we can re-enable
use of vector pair in the inline expansion of memcpy/memmove.
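For illustration only, the default that this change installs can be modeled in plain C as below. This is a sketch, not the GCC code: MASK_BLOCK_OPS_VECTOR_PAIR and default_vector_pair_flag are hypothetical stand-ins for OPTION_MASK_BLOCK_OPS_VECTOR_PAIR and the relevant part of rs6000_option_override_internal, and the two booleans stand in for TARGET_MMA and TARGET_EFFICIENT_UNALIGNED_VSX.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for OPTION_MASK_BLOCK_OPS_VECTOR_PAIR; the real
   mask value comes from GCC's option machinery.  */
#define MASK_BLOCK_OPS_VECTOR_PAIR (UINT64_C (1) << 0)

/* Model of the default selection: an explicit user setting is left alone;
   otherwise the flag is turned on only when both target features are
   available, and turned off in every other case.  */
uint64_t
default_vector_pair_flag (uint64_t isa_flags, uint64_t isa_flags_explicit,
			  bool target_mma, bool efficient_unaligned_vsx)
{
  if (!(isa_flags_explicit & MASK_BLOCK_OPS_VECTOR_PAIR))
    {
      if (target_mma && efficient_unaligned_vsx)
	isa_flags |= MASK_BLOCK_OPS_VECTOR_PAIR;
      else
	isa_flags &= ~MASK_BLOCK_OPS_VECTOR_PAIR;
    }
  return isa_flags;
}
```

The point of the model is that a user-specified setting is never overridden; only the implicit default changes.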

After bootstrap/regtest, OK for trunk?

Thanks,
Aaron

gcc/
* config/rs6000/rs6000.c (rs6000_option_override_internal):
Enable vector pair memcpy/memmove expansion.
---
 gcc/config/rs6000/rs6000.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index bb48ed92aef..53f92970414 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4117,11 +4117,10 @@ rs6000_option_override_internal (bool global_init_p)
 
   if (!(rs6000_isa_flags_explicit & OPTION_MASK_BLOCK_OPS_VECTOR_PAIR))
 {
-  /* When the POImode issues of PR96791 are resolved, then we can
-once again enable use of vector pair for memcpy/memmove on
-P10 if we have TARGET_MMA.  For now we make it disabled by
-default for all targets.  */
-  rs6000_isa_flags &= ~OPTION_MASK_BLOCK_OPS_VECTOR_PAIR;
+  if (TARGET_MMA && TARGET_EFFICIENT_UNALIGNED_VSX)
+   rs6000_isa_flags |= OPTION_MASK_BLOCK_OPS_VECTOR_PAIR;
+  else
+   rs6000_isa_flags &= ~OPTION_MASK_BLOCK_OPS_VECTOR_PAIR;
 }
 
   /* Use long double size to select the appropriate long double.  We use
-- 
2.18.4



[PATCH] Additional small changes to support opaque modes

2020-11-19 Thread acsawdey--- via Gcc-patches
From: Aaron Sawdey 

After building some larger programs, including some C++ code, that use
opaque types, it became clear that I needed to go through and look for
places where opaque types and modes had to be handled. The result is a
whole pile of one-liners.
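To give a feel for what these one-liners look like, here is a self-contained sketch modeled on the type_to_class change. The enums are hypothetical, trimmed-down stand-ins (suffixed _M / _m) for GCC's tree codes and the classes in typeclass.h; their values are not GCC's.

```c
/* Hypothetical stand-ins for a few GCC tree codes.  */
enum tree_code_model
{
  VOID_TYPE_M,
  INTEGER_TYPE_M,
  LANG_TYPE_M,
  OPAQUE_TYPE_M,
  OTHER_TYPE_M
};

/* Hypothetical stand-ins for a few type classes.  */
enum type_class_model
{
  no_type_class_m = -1,
  void_type_class_m,
  integer_type_class_m,
  lang_type_class_m,
  opaque_type_class_m
};

/* Map a tree code to its type class.  The OPAQUE_TYPE_M case is the kind
   of single-line addition the patch makes; without it, opaque types would
   fall through to the default and be reported as no_type_class_m.  */
enum type_class_model
type_to_class_model (enum tree_code_model code)
{
  switch (code)
    {
    case VOID_TYPE_M:	 return void_type_class_m;
    case INTEGER_TYPE_M: return integer_type_class_m;
    case LANG_TYPE_M:	 return lang_type_class_m;
    case OPAQUE_TYPE_M:	 return opaque_type_class_m;
    default:		 return no_type_class_m;
    }
}
```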

If bootstrap/regtest passes for ppc64le and x86_64, ok for trunk?

gcc/
* typeclass.h: Add opaque_type_class.
* builtins.c (type_to_class): Identify opaque type class.
* c-family/c-pretty-print.c (c_pretty_printer::simple_type_specifier):
Treat opaque types like other types.
(c_pretty_printer::direct_abstract_declarator): Opaque types are
supported types.
* c/c-aux-info.c (gen_type): Support opaque types.
* cp/error.c (dump_type): Handle opaque types.
(dump_type_prefix): Handle opaque types.
(dump_type_suffix): Handle opaque types.
(dump_expr): Handle opaque types.
* cp/pt.c (tsubst): Allow opaque types in templates.
(unify): Allow opaque types in templates.
* cp/typeck.c (structural_comptypes): Handle comparison
of opaque types.
* dwarf2out.c (is_base_type): Handle opaque types.
(loc_descriptor): Handle opaque modes like VOIDmode/BLKmode.
(gen_type_die_with_usage): Handle opaque types.
* expr.c (count_type_elements): Opaque types should
never have initializers.
* ipa-devirt.c (odr_types_equivalent_p): No type-specific handling
for opaque types is needed as it eventually checks the underlying
mode which is what is important.
* tree-streamer.c (record_common_node): Handle opaque types.
* tree.c (type_contains_placeholder_1): Handle opaque types.
(type_cache_hasher::equal): No additional comparison needed for
opaque types.
---
 gcc/builtins.c| 1 +
 gcc/c-family/c-pretty-print.c | 2 ++
 gcc/c/c-aux-info.c| 4 
 gcc/cp/error.c| 4 
 gcc/cp/pt.c   | 2 ++
 gcc/cp/typeck.c   | 1 +
 gcc/dwarf2out.c   | 4 +++-
 gcc/expr.c| 1 +
 gcc/ipa-devirt.c  | 1 +
 gcc/tree-streamer.c   | 1 +
 gcc/tree.c| 2 ++
 gcc/typeclass.h   | 2 +-
 12 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 42c52a1925e..0958abcae49 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -2228,6 +2228,7 @@ type_to_class (tree type)
 case ARRAY_TYPE:  return (TYPE_STRING_FLAG (type)
   ? string_type_class : array_type_class);
 case LANG_TYPE:   return lang_type_class;
+case OPAQUE_TYPE:  return opaque_type_class;
 default:  return no_type_class;
 }
 }
diff --git a/gcc/c-family/c-pretty-print.c b/gcc/c-family/c-pretty-print.c
index 8953e3b678b..3027703056b 100644
--- a/gcc/c-family/c-pretty-print.c
+++ b/gcc/c-family/c-pretty-print.c
@@ -342,6 +342,7 @@ c_pretty_printer::simple_type_specifier (tree t)
   break;
 
 case VOID_TYPE:
+case OPAQUE_TYPE:
 case BOOLEAN_TYPE:
 case INTEGER_TYPE:
 case REAL_TYPE:
@@ -662,6 +663,7 @@ c_pretty_printer::direct_abstract_declarator (tree t)
 
 case IDENTIFIER_NODE:
 case VOID_TYPE:
+case OPAQUE_TYPE:
 case BOOLEAN_TYPE:
 case INTEGER_TYPE:
 case REAL_TYPE:
diff --git a/gcc/c/c-aux-info.c b/gcc/c/c-aux-info.c
index ffc8099856d..41f5598de38 100644
--- a/gcc/c/c-aux-info.c
+++ b/gcc/c/c-aux-info.c
@@ -413,6 +413,10 @@ gen_type (const char *ret_val, tree t, formals_style style)
  data_type = IDENTIFIER_POINTER (DECL_NAME (TYPE_NAME (t)));
  break;
 
+   case OPAQUE_TYPE:
+ data_type = IDENTIFIER_POINTER (DECL_NAME (TYPE_NAME (t)));
+ break;
+
case VOID_TYPE:
  data_type = "void";
  break;
diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index 396558be17f..d27545d1223 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -529,6 +529,7 @@ dump_type (cxx_pretty_printer *pp, tree t, int flags)
 case INTEGER_TYPE:
 case REAL_TYPE:
 case VOID_TYPE:
+case OPAQUE_TYPE:
 case BOOLEAN_TYPE:
 case COMPLEX_TYPE:
 case VECTOR_TYPE:
@@ -874,6 +875,7 @@ dump_type_prefix (cxx_pretty_printer *pp, tree t, int flags)
 case UNION_TYPE:
 case LANG_TYPE:
 case VOID_TYPE:
+case OPAQUE_TYPE:
 case TYPENAME_TYPE:
 case COMPLEX_TYPE:
 case VECTOR_TYPE:
@@ -997,6 +999,7 @@ dump_type_suffix (cxx_pretty_printer *pp, tree t, int flags)
 case UNION_TYPE:
 case LANG_TYPE:
 case VOID_TYPE:
+case OPAQUE_TYPE:
 case TYPENAME_TYPE:
 case COMPLEX_TYPE:
 case VECTOR_TYPE:
@@ -2810,6 +2813,7 @@ dump_expr (cxx_pretty_printer *pp, tree t, int flags)
 case ENUMERAL_TYPE:
 case REAL_TYPE:
 case VOID_TYPE:
+case OPAQUE_TYPE:
 case BOOLEAN_TYPE:
 case INTEGER_TYPE:
 case COMPLEX_TYPE:
diff --git a/gcc/cp/pt.c b/gcc/

[PATCH,rs6000] Make MMA builtins use opaque modes [v2]

2020-11-19 Thread acsawdey--- via Gcc-patches
From: Aaron Sawdey 

Segher & Bergner -
  Thanks for the reviews, here's the updated patch after fixing those things.
We now have an UNSPEC for xxsetaccz, and an accompanying change to
rs6000_rtx_costs to make it be cost 0 so that CSE doesn't try to replace it
with a bunch of register moves.

If bootstrap/regtest looks good, ok for trunk?

Thanks,
Aaron

gcc/
* gcc/config/rs6000/mma.md (unspec): Add assemble/extract UNSPECs.
(movoi): Change to movoo.
(*movpoi): Change to *movoo.
(movxi): Change to movxo.
(*movpxi): Change to *movxo.
(mma_assemble_pair): Change to OO mode.
(*mma_assemble_pair): New define_insn_and_split.
(mma_disassemble_pair): New define_expand.
(*mma_disassemble_pair): New define_insn_and_split.
(mma_assemble_acc): Change to XO mode.
(*mma_assemble_acc): Change to XO mode.
(mma_disassemble_acc): New define_expand.
(*mma_disassemble_acc): New define_insn_and_split.
(mma_): Change to XO mode.
(mma_): Change to XO mode.
(mma_): Change to XO mode.
(mma_): Change to OO mode.
(mma_): Change to XO/OO mode.
(mma_): Change to XO mode.
(mma_): Change to XO mode.
(mma_): Change to XO mode.
(mma_): Change to XO mode.
(mma_): Change to XO mode.
(mma_): Change to XO mode.
(mma_): Change to XO/OO mode.
(mma_): Change to XO/OO mode.
(mma_): Change to XO mode.
(mma_): Change to XO mode.
* gcc/config/rs6000/predicates.md (input_operand): Allow opaque.
(mma_disassemble_output_operand): New predicate.
* gcc/config/rs6000/rs6000-builtin.def:
Changes to disassemble builtins.
* gcc/config/rs6000/rs6000-call.c (rs6000_return_in_memory):
Disallow __vector_pair/__vector_quad as return types.
(rs6000_promote_function_mode): Remove function return type
check because we can't test it here any more.
(rs6000_function_arg): Do not allow __vector_pair/__vector_quad
as function arguments.
(rs6000_gimple_fold_mma_builtin):
Handle mma_disassemble_* builtins.
(rs6000_init_builtins): Create types for XO/OO modes.
* gcc/config/rs6000/rs6000-modes.def: Delete OI, XI,
POI, and PXI modes, and create XO and OO modes.
* gcc/config/rs6000/rs6000-string.c (expand_block_move):
Update to OO mode.
* gcc/config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok_uncached):
Update for XO/OO modes.
(rs6000_rtx_costs): Make UNSPEC_MMA_XXSETACCZ cost 0.
(rs6000_modes_tieable_p): Update for XO/OO modes.
(rs6000_debug_reg_global): Update for XO/OO modes.
(rs6000_setup_reg_addr_masks): Update for XO/OO modes.
(rs6000_init_hard_regno_mode_ok): Update for XO/OO modes.
(reg_offset_addressing_ok_p): Update for XO/OO modes.
(rs6000_emit_move): Update for XO/OO modes.
(rs6000_preferred_reload_class): Update for XO/OO modes.
(rs6000_split_multireg_move): Update for XO/OO modes.
(rs6000_mangle_type): Update for opaque types.
(rs6000_invalid_conversion): Update for XO/OO modes.
* gcc/config/rs6000/rs6000.h (VECTOR_ALIGNMENT_P):
Update for XO/OO modes.
* gcc/config/rs6000/rs6000.md (RELOAD): Update for XO/OO modes.
gcc/testsuite/
* gcc.target/powerpc/mma-double-test.c (main): Call abort for failure.
* gcc.target/powerpc/mma-single-test.c (main): Call abort for failure.
* gcc.target/powerpc/pr96506.c: Rename to pr96506-1.c.
* gcc.target/powerpc/pr96506-2.c: New test.
---
 gcc/config/rs6000/mma.md  | 421 ++
 gcc/config/rs6000/predicates.md   |  12 +
 gcc/config/rs6000/rs6000-builtin.def  |  14 +-
 gcc/config/rs6000/rs6000-call.c   | 142 +++---
 gcc/config/rs6000/rs6000-modes.def|  10 +-
 gcc/config/rs6000/rs6000-string.c |   6 +-
 gcc/config/rs6000/rs6000.c| 193 
 gcc/config/rs6000/rs6000.h|   3 +-
 gcc/config/rs6000/rs6000.md   |   2 +-
 .../gcc.target/powerpc/mma-double-test.c  |   3 +
 .../gcc.target/powerpc/mma-single-test.c  |   3 +
 .../powerpc/{pr96506.c => pr96506-1.c}|  24 -
 gcc/testsuite/gcc.target/powerpc/pr96506-2.c  |  38 ++
 13 files changed, 508 insertions(+), 363 deletions(-)
 rename gcc/testsuite/gcc.target/powerpc/{pr96506.c => pr96506-1.c} (61%)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr96506-2.c

diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
index a3fd28bdd0a..63bb73a01e7 100644
--- a/gcc/config/rs6000/mma.md
+++ b/gcc/config/rs6000/mma.md
@@ -19,24 +19,18 @@
 ;; along with GCC; see the file COPYING3.  If not see
 ;; .
 
-;; The MMA patterns use the multi-register PXImode and POImode partial
-;; in

[PATCH,rs6000] Combine patterns for p10 load-cmpi fusion

2020-12-04 Thread acsawdey--- via Gcc-patches
From: Aaron Sawdey 

This patch adds the first batch of patterns to support p10 fusion. These
will allow combine to create a single insn for a pair of instructions
that power10 can fuse and execute. These particular ones have the
requirement that only cr0 can be used when fusing a load with a compare
immediate of -1/0/1 (if signed) or 0/1 (if unsigned), so we want combine
to put that requirement in, and if it doesn't work out later the splitter
can get used.
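As a rough model of the immediate restriction described above, the range check can be written in plain C as follows. This is only a sketch: the real predicates operate on RTL CONST_INT operands, and the function names here are hypothetical.

```c
#include <stdbool.h>

/* Signed compares: the immediate must be -1, 0 or 1.  */
bool
const_m1_to_1_model (long long value)
{
  return value >= -1 && value <= 1;
}

/* Unsigned compares: the immediate must be 0 or 1.  */
bool
const_0_to_1_model (long long value)
{
  return value == 0 || value == 1;
}

/* Combined check: is VALUE an acceptable immediate for a fused
   load + compare-immediate, given the signedness of the compare?  */
bool
immediate_ok_for_ld_cmpi (long long value, bool is_signed)
{
  return is_signed ? const_m1_to_1_model (value)
		   : const_0_to_1_model (value);
}
```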

The patterns are generated by a script genfusion.pl and live in new file
fusion.md. This script will be expanded to generate more patterns for
fusion.

This also adds option -mpower10-fusion which defaults on for power10 and
will gate all these fusion patterns. In addition I have added an
undocumented option -mpower10-fusion-ld-cmpi (which may be removed later)
that just controls the load+compare-immediate patterns. I have made
these default on for power10 but they are not disallowed for earlier
processors because it is still valid code. This allows us to test the
correctness of fusion code generation by turning it on explicitly.

If bootstrap/regtest is clean, ok for trunk?

Thanks!

   Aaron

gcc/ChangeLog:

* config/rs6000/genfusion.pl: New file, script to generate
define_insn_and_split patterns so combine can arrange fused
instructions next to each other.
* config/rs6000/fusion.md: New file, generated fused instruction
patterns for combine.
* config/rs6000/predicates.md (const_m1_to_1_operand): New predicate.
(non_update_memory_operand): New predicate.
* config/rs6000/rs6000-cpus.def: Add OPTION_MASK_P10_FUSION and
OPTION_MASK_P10_FUSION_LD_CMPI to ISA_3_1_MASKS_SERVER and
POWERPC_MASKS.
* config/rs6000/rs6000-protos.h (address_is_non_pfx_d_or_x): Add
prototype.
* config/rs6000/rs6000.c (rs6000_option_override_internal):
Automatically set -mpower10-fusion and -mpower10-fusion-ld-cmpi
if target is power10.  (rs6000_opt_masks): Allow -mpower10-fusion
in function attributes.  (address_is_non_pfx_d_or_x): New function.
* config/rs6000/rs6000.h: Add MASK_P10_FUSION.
* config/rs6000/rs6000.md: Include fusion.md.
* config/rs6000/rs6000.opt: Add -mpower10-fusion
and -mpower10-fusion-ld-cmpi.
* config/rs6000/t-rs6000: Add dependencies involving fusion.md.
---
 gcc/config/rs6000/fusion.md   | 357 ++
 gcc/config/rs6000/genfusion.pl| 144 
 gcc/config/rs6000/predicates.md   |  14 ++
 gcc/config/rs6000/rs6000-cpus.def |   6 +-
 gcc/config/rs6000/rs6000-protos.h |   2 +
 gcc/config/rs6000/rs6000.c|  51 +
 gcc/config/rs6000/rs6000.h|   1 +
 gcc/config/rs6000/rs6000.md   |   1 +
 gcc/config/rs6000/rs6000.opt  |   8 +
 gcc/config/rs6000/t-rs6000|   6 +-
 10 files changed, 588 insertions(+), 2 deletions(-)
 create mode 100644 gcc/config/rs6000/fusion.md
 create mode 100755 gcc/config/rs6000/genfusion.pl

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
new file mode 100644
index 000..a4d3a6ae7f3
--- /dev/null
+++ b/gcc/config/rs6000/fusion.md
@@ -0,0 +1,357 @@
+;; -*- buffer-read-only: t -*-
+;; Generated automatically by genfusion.pl
+
+;; Copyright (C) 2020 Free Software Foundation, Inc.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it under
+;; the terms of the GNU General Public License as published by the Free
+;; Software Foundation; either version 3, or (at your option) any later
+;; version.
+;;
+;; GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+;; WARRANTY; without even the implied warranty of MERCHANTABILITY or
+;; FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+;; for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; .
+
+;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
+;; load mode is DI result mode is clobber compare mode is CC extend is none
+(define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none"
+  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
+(compare:CC (match_operand:DI 1 "non_update_memory_operand" "m")
+ (match_operand:DI 3 "const_m1_to_1_operand" "n")))
+   (clobber (match_scratch:DI 0 "=r"))]
+  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
+  "ld%X1 %0,%1\;cmpdi 0,%0,%3"
+  "&& reload_completed
+   && (cc_reg_not_cr0_operand (operands[2], CCmode)
+   || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), DImode, NON_PREFIXED_DS))"
+  [(set (match_dup 0) (match_dup 1))
+   (set (match_dup 2)
+(compare:CC (match_dup 0)
+   (match_dup 3)))]
+  ""
+  [(set_attr "type" "load")
+   (set_attr "cost" "8")
+   (set_attr "length" "8")])
+
+;; load-cmpi fusion pattern generate

[PATCH,rs6000] Optimize pcrel access of globals [ping]

2020-12-09 Thread acsawdey--- via Gcc-patches
From: Aaron Sawdey 

Ping. I've folded in the changes to comments suggested by Will Schmidt.

This patch implements a RTL pass that looks for pc-relative loads of the
address of an external variable using the PCREL_GOT relocation and a
single load or store that uses that external address.

Produced by a cast of thousands:
 * Michael Meissner
 * Peter Bergner
 * Bill Schmidt
 * Alan Modra
 * Segher Boessenkool
 * Aaron Sawdey

Passes bootstrap/regtest on ppc64le power10. Should have no effect on
other processors. OK for trunk?

Thanks!
   Aaron

gcc/ChangeLog:

* config.gcc: Add pcrel-opt.c and pcrel-opt.o.
* config/rs6000/pcrel-opt.c: New file.
* config/rs6000/pcrel-opt.md: New file.
* config/rs6000/predicates.md: Add d_form_memory predicate.
* config/rs6000/rs6000-cpus.def: Add OPTION_MASK_PCREL_OPT.
* config/rs6000/rs6000-passes.def: Add pass_pcrel_opt.
* config/rs6000/rs6000-protos.h: Add reg_to_non_prefixed(),
offsettable_non_prefixed_memory(), output_pcrel_opt_reloc(),
and make_pass_pcrel_opt().
* config/rs6000/rs6000.c (reg_to_non_prefixed): Make global.
(rs6000_option_override_internal): Add pcrel-opt.
(rs6000_delegitimize_address): Support pcrel-opt.
(rs6000_opt_masks): Add pcrel-opt.
(offsettable_non_prefixed_memory): New function.
(reg_to_non_prefixed): Make global.
(rs6000_asm_output_opcode): Reset next_insn_prefixed_p.
(output_pcrel_opt_reloc): New function.
* config/rs6000/rs6000.md (loads_extern_addr): New attr.
(pcrel_extern_addr): Set loads_extern_addr.
Add include for pcrel-opt.md.
* config/rs6000/rs6000.opt: Add -mpcrel-opt.
* config/rs6000/t-rs6000: Add rules for pcrel-opt.c and
pcrel-opt.md.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pcrel-opt-inc-di.c: New test.
* gcc.target/powerpc/pcrel-opt-ld-df.c: New test.
* gcc.target/powerpc/pcrel-opt-ld-di.c: New test.
* gcc.target/powerpc/pcrel-opt-ld-hi.c: New test.
* gcc.target/powerpc/pcrel-opt-ld-qi.c: New test.
* gcc.target/powerpc/pcrel-opt-ld-sf.c: New test.
* gcc.target/powerpc/pcrel-opt-ld-si.c: New test.
* gcc.target/powerpc/pcrel-opt-ld-vector.c: New test.
* gcc.target/powerpc/pcrel-opt-st-df.c: New test.
* gcc.target/powerpc/pcrel-opt-st-di.c: New test.
* gcc.target/powerpc/pcrel-opt-st-hi.c: New test.
* gcc.target/powerpc/pcrel-opt-st-qi.c: New test.
* gcc.target/powerpc/pcrel-opt-st-sf.c: New test.
* gcc.target/powerpc/pcrel-opt-st-si.c: New test.
* gcc.target/powerpc/pcrel-opt-st-vector.c: New test.
---
 gcc/config.gcc|   6 +-
 gcc/config/rs6000/pcrel-opt.c | 888 ++
 gcc/config/rs6000/pcrel-opt.md| 386 
 gcc/config/rs6000/predicates.md   |  23 +
 gcc/config/rs6000/rs6000-cpus.def |   2 +
 gcc/config/rs6000/rs6000-passes.def   |   8 +
 gcc/config/rs6000/rs6000-protos.h |   4 +
 gcc/config/rs6000/rs6000.c| 116 ++-
 gcc/config/rs6000/rs6000.md   |   8 +-
 gcc/config/rs6000/rs6000.opt  |   4 +
 gcc/config/rs6000/t-rs6000|   7 +-
 .../gcc.target/powerpc/pcrel-opt-inc-di.c |  18 +
 .../gcc.target/powerpc/pcrel-opt-ld-df.c  |  36 +
 .../gcc.target/powerpc/pcrel-opt-ld-di.c  |  43 +
 .../gcc.target/powerpc/pcrel-opt-ld-hi.c  |  42 +
 .../gcc.target/powerpc/pcrel-opt-ld-qi.c  |  42 +
 .../gcc.target/powerpc/pcrel-opt-ld-sf.c  |  42 +
 .../gcc.target/powerpc/pcrel-opt-ld-si.c  |  41 +
 .../gcc.target/powerpc/pcrel-opt-ld-vector.c  |  36 +
 .../gcc.target/powerpc/pcrel-opt-st-df.c  |  36 +
 .../gcc.target/powerpc/pcrel-opt-st-di.c  |  37 +
 .../gcc.target/powerpc/pcrel-opt-st-hi.c  |  42 +
 .../gcc.target/powerpc/pcrel-opt-st-qi.c  |  42 +
 .../gcc.target/powerpc/pcrel-opt-st-sf.c  |  36 +
 .../gcc.target/powerpc/pcrel-opt-st-si.c  |  41 +
 .../gcc.target/powerpc/pcrel-opt-st-vector.c  |  36 +
 26 files changed, 2013 insertions(+), 9 deletions(-)
 create mode 100644 gcc/config/rs6000/pcrel-opt.c
 create mode 100644 gcc/config/rs6000/pcrel-opt.md
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-inc-di.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-df.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-di.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-hi.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-qi.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-sf.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-si.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-vector.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-st-df.c
 create 

[PATCH, rs6000] Optimize pcrel access of globals

2020-10-20 Thread acsawdey--- via Gcc-patches
From: Aaron Sawdey 

This patch implements a RTL pass that looks for pc-relative loads of the
address of an external variable using the PCREL_GOT relocation and a
single load or store that uses that external address. It then uses the
PCREL_OPT relocation to convert that first load into a single pc-relative
load or store to directly access that external variable.
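As an illustration of the kind of source this targets, consider the small C sketch below: each function makes exactly one access to a global. The names are hypothetical, and the variable is defined here only so that the example links on its own; in the case the pass is aimed at, it would be defined in another module or shared library. Whether the optimization actually applies depends on the target and options in use, so no particular assembly output is claimed.

```c
/* Stand-in definition so the sketch is self-contained; the interesting
   case is when this lives in a different module.  */
long ext_counter = 41;

/* A single load through the variable's address.  */
long
load_ext (void)
{
  return ext_counter;
}

/* A single store through the variable's address.  */
void
store_ext (long value)
{
  ext_counter = value;
}
```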

Produced by a cast of thousands:
 * Michael Meissner
 * Peter Bergner
 * Bill Schmidt
 * Alan Modra
 * Segher Boessenkool
 * Aaron Sawdey

Passes bootstrap/regtest on ppc64le power10. OK for trunk?

gcc/ChangeLog:

* config.gcc: Add pcrel-opt.o.
* config/rs6000/pcrel-opt.c: New file.
* config/rs6000/pcrel-opt.md: New file.
* config/rs6000/predicates.md: Add d_form_memory predicate.
* config/rs6000/rs6000-cpus.def: Add OPTION_MASK_PCREL_OPT.
* config/rs6000/rs6000-passes.def: Add pass_pcrel_opt.
* config/rs6000/rs6000-protos.h: Add reg_to_non_prefixed(),
offsettable_non_prefixed_memory(), output_pcrel_opt_reloc(),
and make_pass_pcrel_opt().
* config/rs6000/rs6000.c (reg_to_non_prefixed): Make global.
(rs6000_option_override_internal): Add pcrel-opt.
(rs6000_delegitimize_address): Support pcrel-opt.
(rs6000_opt_masks): Add pcrel-opt.
(offsettable_non_prefixed_memory): New function.
(reg_to_non_prefixed): Make global.
(rs6000_asm_output_opcode): Reset next_insn_prefixed_p.
(output_pcrel_opt_reloc): New function.
* config/rs6000/rs6000.md (loads_extern_addr): New attr.
(pcrel_extern_addr): Set loads_extern_addr.
Add include for pcrel-opt.md.
* config/rs6000/rs6000.opt: Add -mpcrel-opt.
* config/rs6000/t-rs6000: Add rules for pcrel-opt.c and
pcrel-opt.md.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pcrel-opt-inc-di.c: New test.
* gcc.target/powerpc/pcrel-opt-ld-df.c: New test.
* gcc.target/powerpc/pcrel-opt-ld-di.c: New test.
* gcc.target/powerpc/pcrel-opt-ld-hi.c: New test.
* gcc.target/powerpc/pcrel-opt-ld-qi.c: New test.
* gcc.target/powerpc/pcrel-opt-ld-sf.c: New test.
* gcc.target/powerpc/pcrel-opt-ld-si.c: New test.
* gcc.target/powerpc/pcrel-opt-ld-vector.c: New test.
* gcc.target/powerpc/pcrel-opt-st-df.c: New test.
* gcc.target/powerpc/pcrel-opt-st-di.c: New test.
* gcc.target/powerpc/pcrel-opt-st-hi.c: New test.
* gcc.target/powerpc/pcrel-opt-st-qi.c: New test.
* gcc.target/powerpc/pcrel-opt-st-sf.c: New test.
* gcc.target/powerpc/pcrel-opt-st-si.c: New test.
* gcc.target/powerpc/pcrel-opt-st-vector.c: New test.
---
 gcc/config.gcc|   6 +-
 gcc/config/rs6000/pcrel-opt.c | 887 ++
 gcc/config/rs6000/pcrel-opt.md| 386 
 gcc/config/rs6000/predicates.md   |  23 +
 gcc/config/rs6000/rs6000-cpus.def |   2 +
 gcc/config/rs6000/rs6000-passes.def   |   8 +
 gcc/config/rs6000/rs6000-protos.h |   4 +
 gcc/config/rs6000/rs6000.c| 116 ++-
 gcc/config/rs6000/rs6000.md   |   8 +-
 gcc/config/rs6000/rs6000.opt  |   4 +
 gcc/config/rs6000/t-rs6000|   7 +-
 .../gcc.target/powerpc/pcrel-opt-inc-di.c |  18 +
 .../gcc.target/powerpc/pcrel-opt-ld-df.c  |  36 +
 .../gcc.target/powerpc/pcrel-opt-ld-di.c  |  43 +
 .../gcc.target/powerpc/pcrel-opt-ld-hi.c  |  42 +
 .../gcc.target/powerpc/pcrel-opt-ld-qi.c  |  42 +
 .../gcc.target/powerpc/pcrel-opt-ld-sf.c  |  42 +
 .../gcc.target/powerpc/pcrel-opt-ld-si.c  |  41 +
 .../gcc.target/powerpc/pcrel-opt-ld-vector.c  |  36 +
 .../gcc.target/powerpc/pcrel-opt-st-df.c  |  36 +
 .../gcc.target/powerpc/pcrel-opt-st-di.c  |  37 +
 .../gcc.target/powerpc/pcrel-opt-st-hi.c  |  42 +
 .../gcc.target/powerpc/pcrel-opt-st-qi.c  |  42 +
 .../gcc.target/powerpc/pcrel-opt-st-sf.c  |  36 +
 .../gcc.target/powerpc/pcrel-opt-st-si.c  |  41 +
 .../gcc.target/powerpc/pcrel-opt-st-vector.c  |  36 +
 26 files changed, 2012 insertions(+), 9 deletions(-)
 create mode 100644 gcc/config/rs6000/pcrel-opt.c
 create mode 100644 gcc/config/rs6000/pcrel-opt.md
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-inc-di.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-df.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-di.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-hi.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-qi.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-sf.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-si.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-vector.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-st-df.c
 create mode 10064

[PATCH,rs6000] Add patterns for combine to support p10 fusion

2020-10-26 Thread acsawdey--- via Gcc-patches
From: Aaron Sawdey 

This patch adds the first couple patterns to support p10 fusion. These
will allow combine to create a single insn for a pair of instructions
that power10 can fuse and execute. These particular ones have the
requirement that only cr0 can be used when fusing a load with a compare
immediate of -1/0/1, so we want combine to put that requirement in, and
if it doesn't work out later the splitter can get used.
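For illustration, source of roughly the following shape produces a load whose result is immediately compared against a small constant, which is the instruction pair these patterns are meant to let combine recognize. The function name is hypothetical, and whether a given compiler build actually emits a fused pair for it depends on the options and on later passes.

```c
/* Returns 1 when the 64-bit value at P equals -1, else 0.  The value is
   loaded and then compared against an immediate in the -1/0/1 range.  */
int
loaded_value_is_minus_one (const long long *p)
{
  return *p == -1;
}
```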

This also adds option -mpower10-fusion which defaults on for power10 and
will gate all these fusion patterns. In addition I have added an
undocumented option -mpower10-fusion-ld-cmpi (which may be removed later)
that just controls the load+compare-immediate patterns. I have made
these default on for power10 but they are not disallowed for earlier
processors because it is still valid code. This allows us to test the
correctness of fusion code generation by turning it on explicitly.
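
As a rough illustration of the load+compare-immediate case described
above, here is a minimal sketch of C source whose straightforward
translation is a load followed by a compare against -1, 0, or 1. The file
name, function names, and the exact command line are assumptions made for
this example and are not part of the patch; whether combine actually forms
the fused insn for any given function is not guaranteed.

```c
/* Hypothetical test file ld-cmpi.c (illustrative names only).
   Assumed invocation on a powerpc64le compiler, using the options
   described in this mail:
     gcc -O2 -mcpu=power9 -mpower10-fusion -mpower10-fusion-ld-cmpi -S ld-cmpi.c
   Then inspect the assembly for a load immediately followed by a
   compare immediate that sets cr0.  */

#include <stdbool.h>

/* Standalone analog of the range accepted by const_m1_to_1_operand:
   only the immediates -1, 0, and 1 qualify.  */
bool
imm_in_m1_to_1 (long imm)
{
  return imm >= -1 && imm <= 1;
}

/* 64-bit load followed by a compare with 0.  */
long
load_cmp_zero (const long *p, long a, long b)
{
  return *p == 0 ? a : b;
}

/* 64-bit load followed by a signed compare with -1.  */
long
load_cmp_m1 (const long *p, long a, long b)
{
  return *p > -1 ? a : b;
}

/* 32-bit sign-extending load followed by a compare with 1.  */
long
load_word_cmp_one (const int *p, long a, long b)
{
  return *p < 1 ? a : b;
}
```

Comparing the output with and without -mpower10-fusion-ld-cmpi is one way
to see what the new patterns change on a pre-power10 target.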

The intention is to work through more patterns of this style to support
the rest of the power10 fusion pairs.

Bootstrap and regtest look good on ppc64le power9 with these patterns
enabled in stage2/stage3 and for regtest. Ok for trunk?

gcc/ChangeLog:

* config/rs6000/predicates.md: Add const_m1_to_1_operand.
* config/rs6000/rs6000-cpus.def: Add OPTION_MASK_P10_FUSION and
OPTION_MASK_P10_FUSION_LD_CMPI to ISA_3_1_MASKS_SERVER.
* config/rs6000/rs6000-protos.h (address_ok_for_form): Add
prototype.
* config/rs6000/rs6000.c (rs6000_option_override_internal):
Automatically set -mpower10-fusion and -mpower10-fusion-ld-cmpi
if target is power10.  (rs6000_opt_masks): Allow -mpower10-fusion
in function attributes.  (address_ok_for_form): New function.
* config/rs6000/rs6000.h: Add MASK_P10_FUSION.
* config/rs6000/rs6000.md (*ld_cmpi_cr0): New
define_insn_and_split.
(*lwa_cmpdi_cr0): New define_insn_and_split.
(*lwa_cmpwi_cr0): New define_insn_and_split.
* config/rs6000/rs6000.opt: Add -mpower10-fusion
and -mpower10-fusion-ld-cmpi.
---
 gcc/config/rs6000/predicates.md   |  5 +++
 gcc/config/rs6000/rs6000-cpus.def |  6 ++-
 gcc/config/rs6000/rs6000-protos.h |  2 +
 gcc/config/rs6000/rs6000.c        | 34 
 gcc/config/rs6000/rs6000.h        |  1 +
 gcc/config/rs6000/rs6000.md       | 68 +++
 gcc/config/rs6000/rs6000.opt      |  8 
 7 files changed, 123 insertions(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 4c2fe7fa312..b75c1ddfb69 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -297,6 +297,11 @@ (define_predicate "const_0_to_1_operand"
   (and (match_code "const_int")
(match_test "IN_RANGE (INTVAL (op), 0, 1)")))
 
+;; Match op = -1, op = 0, or op = 1.
+(define_predicate "const_m1_to_1_operand"
+  (and (match_code "const_int")
+   (match_test "IN_RANGE (INTVAL (op), -1, 1)")))
+
 ;; Match op = 0..3.
 (define_predicate "const_0_to_3_operand"
   (and (match_code "const_int")
diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def
index 8d2c1ffd6cf..3e65289d8df 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -82,7 +82,9 @@
 
 #define ISA_3_1_MASKS_SERVER   (ISA_3_0_MASKS_SERVER   \
 | OPTION_MASK_POWER10  \
-| OTHER_POWER10_MASKS)
+| OTHER_POWER10_MASKS  \
+| OPTION_MASK_P10_FUSION   \
+| OPTION_MASK_P10_FUSION_LD_CMPI)
 
 /* Flags that need to be turned off if -mno-power9-vector.  */
 #define OTHER_P9_VECTOR_MASKS  (OPTION_MASK_FLOAT128_HW\
@@ -129,6 +131,8 @@
 | OPTION_MASK_FLOAT128_KEYWORD \
 | OPTION_MASK_FPRND\
 | OPTION_MASK_POWER10  \
+| OPTION_MASK_P10_FUSION   \
+| OPTION_MASK_P10_FUSION_LD_CMPI   \
 | OPTION_MASK_HTM  \
 | OPTION_MASK_ISEL \
 | OPTION_MASK_MFCRF\
diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index 25fa5dd57cd..d8a344245e6 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -190,6 +190,8 @@ enum non_prefixed_form {
 
 extern enum insn_form address_to_insn_form (rtx, machine_mode,
enum non_prefixed_form);
+extern bool address_ok_for_form (rtx, machine_mode,
+enum non_prefixed_form);
 extern bool