mmaplus)] Update ChangeLog.*

Peter Bergner via Gcc-cvs Tue, 10 Dec 2024 13:02:44 -0800

https://gcc.gnu.org/g:93f4f03f35b1c37ee6618d7907e90b66c0bc0c88


commit 93f4f03f35b1c37ee6618d7907e90b66c0bc0c88
Author: Michael Meissner <meiss...@linux.ibm.com>
Date:   Tue Oct 22 17:55:58 2024 -0400

    Update ChangeLog.*

Diff:
---
 gcc/ChangeLog.mmaplus | 814 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 814 insertions(+)

diff --git a/gcc/ChangeLog.mmaplus b/gcc/ChangeLog.mmaplus
index 0872229a6baf..9664bf85d2b8 100644
--- a/gcc/ChangeLog.mmaplus
+++ b/gcc/ChangeLog.mmaplus
@@ -1,3 +1,817 @@
+==================== Branch mmaplus, patch #19 ====================
+
+RFC2655-Add saturating subtract built-ins.
+
+This patch adds support for a saturating subtract built-in function that may be
+added to a future PowerPC processor.  Note, if it is added, the name of the
+built-in function may change before GCC 13 is released.  If the name changes,
+we will submit a patch changing the name.
+
+I also added support for providing dense math built-in functions, even though
+at present, we have not added any new built-in functions for dense math.  It is
+likely we will want to add new dense math built-in functions as the dense math
+support is fleshed out.
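+
+As a rough illustration of what "saturating subtract" means, here is a portable
+C reference model.  It assumes unsigned operands that clamp at zero, which the
+text above does not state; the helper names sat_sub_u32 and sat_sub_u64 are
+hypothetical and are not the built-in functions themselves.
+
```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference model, not the built-in itself: unsigned
   subtraction that clamps at zero instead of wrapping around.  */
static inline uint32_t
sat_sub_u32 (uint32_t a, uint32_t b)
{
  return a > b ? a - b : 0;
}

/* Same model for 64-bit operands.  */
static inline uint64_t
sat_sub_u64 (uint64_t a, uint64_t b)
{
  return a > b ? a - b : 0;
}
```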
+
+The patches have been tested on both little and big endian systems.  Can I check
+it into the master branch?
+
+2024-10-22   Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       * config/rs6000/rs6000-builtin.cc (rs6000_invalid_builtin): Add support
+       for flagging invalid use of future built-in functions.
+       (rs6000_builtin_is_supported): Add support for future built-in
+       functions.
+       * config/rs6000/rs6000-builtins.def (__builtin_saturate_subtract32): New
+       built-in function for -mcpu=future.
+       (__builtin_saturate_subtract64): Likewise.
+       * config/rs6000/rs6000-gen-builtins.cc (enum bif_stanza): Add stanzas
+       for -mcpu=future built-ins.
+       (stanza_map): Likewise.
+       (enable_string): Likewise.
+       (struct attrinfo): Likewise.
+       (parse_bif_attrs): Likewise.
+       (write_decls): Likewise.
+       * config/rs6000/rs6000.md (sat_sub<mode>3): Add saturating subtract
+       built-in insn declarations.
+       (sat_sub<mode>3_dot): Likewise.
+       (sat_sub<mode>3_dot2): Likewise.
+       * doc/extend.texi (Future PowerPC built-ins): New section.
+
+gcc/testsuite/
+
+       * gcc.target/powerpc/subfus-1.c: New test.
+       * gcc.target/powerpc/subfus-2.c: Likewise.
+
+==================== Branch mmaplus, patch #18 ====================
+
+RFC2656-Support load/store vector with right length.
+
+This patch adds support for new instructions that may be added to the PowerPC
+architecture in the future to enhance the load and store vector with length
+instructions.
+
+The current instructions (lxvl, lxvll, stxvl, and stxvll) are inconvenient to use
+since the count for the number of bytes must be in the top 8 bits of the GPR
+register, instead of the bottom 8 bits.  This meant that code generating these
+instructions typically had to do a shift left by 56 bits to get the count into
+the right position.  In a future version of the PowerPC architecture, new
+variants of these instructions might be added that expect the count to be in
+the bottom 8 bits of the GPR register.  These patches add this support to GCC
+if the user uses the -mcpu=future option.
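+
+The difference in where the byte count lives can be modeled in plain C.  This
+is only a sketch of the register layout described above; the helper names are
+hypothetical and nothing here emits the actual instructions.
+
```c
#include <assert.h>
#include <stdint.h>

/* Existing lxvl/stxvl style: the byte count sits in the top 8 bits of
   the 64-bit GPR, so the caller shifts it left by 56 bits first.  */
static inline uint64_t
length_in_top_byte (uint64_t nbytes)
{
  return (nbytes & 0xff) << 56;
}

/* Possible future style: the count is used from the bottom 8 bits, so
   no shift is needed.  */
static inline uint64_t
length_in_bottom_byte (uint64_t nbytes)
{
  return nbytes & 0xff;
}

/* What each style would read back out of the GPR.  */
static inline unsigned
decode_top_byte (uint64_t gpr)
{
  return (unsigned) (gpr >> 56);
}

static inline unsigned
decode_bottom_byte (uint64_t gpr)
{
  return (unsigned) (gpr & 0xff);
}
```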
+
+I discovered that the code in rs6000-string.cc to generate ISA 3.1 lxvl/stxvl
+and future lxvll/stxvll instructions would generate these instructions on
+32-bit.  However, the patterns for these instructions are only enabled on
+64-bit systems.  So I added a check for 64-bit support before generating the
+instructions.
+
+The patches have been tested on both little and big endian systems.  Can I check
+it into the master branch?
+
+2024-10-22   Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       * config/rs6000/rs6000-string.cc (expand_block_move): Do not generate
+       lxvl and stxvl on 32-bit.
+       * config/rs6000/vsx.md (lxvl): If -mcpu=future, generate the lxvl with
+       the shift count automatically used in the insn.
+       (lxvrl): New insn for -mcpu=future.
+       (lxvrll): Likewise.
+       (stxvl): If -mcpu=future, generate the stxvl with the shift count
+       automatically used in the insn.
+       (stxvrl): New insn for -mcpu=future.
+       (stxvrll): Likewise.
+
+gcc/testsuite/
+
+       * gcc.target/powerpc/lxvrl.c: New test.
+       * lib/target-supports.exp (check_effective_target_powerpc_future_ok):
+       New effective target.
+
+==================== Branch mmaplus, patch #17 ====================
+
+RFC2653-PowerPC: Add support for 1,024 bit DMR registers.
+
+This patch is a preliminary patch to add the full 1,024 bit dense math registers
+(DMRs) for -mcpu=future.  The MMA 512-bit accumulators map onto the top of the
+DMR register.
+
+This patch only adds the new 1,024 bit register support.  It does not add
+support for any instructions that need 1,024 bit registers instead of 512 bit
+registers.
+
+I used the new mode 'TDOmode' to be the opaque mode used for 1,024 bit
+registers.  The 'wD' constraint added in previous patches is used for these
+registers.  I added support to do load and store of DMRs via the VSX registers,
+since there are no load/store dense math instructions.  I added the new keyword
+'__dmr' to create 1,024 bit types that can be loaded into DMRs.  At present, I
+don't have aliases for __dmr512 and __dmr1024 that we've discussed internally.
+
+The patches have been tested on both little and big endian systems.  Can I check
+it into the master branch?
+
+2024-10-22   Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       * config/rs6000/mma.md (UNSPEC_DM_INSERT512_UPPER): New unspec.
+       (UNSPEC_DM_INSERT512_LOWER): Likewise.
+       (UNSPEC_DM_EXTRACT512): Likewise.
+       (UNSPEC_DMR_RELOAD_FROM_MEMORY): Likewise.
+       (UNSPEC_DMR_RELOAD_TO_MEMORY): Likewise.
+       (movtdo): New define_expand and define_insn_and_split to implement 1,024
+       bit DMR registers.
+       (movtdo_insert512_upper): New insn.
+       (movtdo_insert512_lower): Likewise.
+       (movtdo_extract512): Likewise.
+       (reload_dmr_from_memory): Likewise.
+       (reload_dmr_to_memory): Likewise.
+       * config/rs6000/rs6000-builtin.cc (rs6000_type_string): Add DMR
+       support.
+       (rs6000_init_builtins): Add support for __dmr keyword.
+       * config/rs6000/rs6000-call.cc (rs6000_return_in_memory): Add support
+       for TDOmode.
+       (rs6000_function_arg): Likewise.
+       * config/rs6000/rs6000-modes.def (TDOmode): New mode.
+       * config/rs6000/rs6000.cc (rs6000_hard_regno_nregs_internal): Add
+       support for TDOmode.
+       (rs6000_hard_regno_mode_ok_uncached): Likewise.
+       (rs6000_hard_regno_mode_ok): Likewise.
+       (rs6000_modes_tieable_p): Likewise.
+       (rs6000_debug_reg_global): Likewise.
+       (rs6000_setup_reg_addr_masks): Likewise.
+       (rs6000_init_hard_regno_mode_ok): Add support for TDOmode.  Setup reload
+       hooks for DMR mode.
+       (reg_offset_addressing_ok_p): Add support for TDOmode.
+       (rs6000_emit_move): Likewise.
+       (rs6000_secondary_reload_simple_move): Likewise.
+       (rs6000_preferred_reload_class): Likewise.
+       (rs6000_secondary_reload_class): Likewise.
+       (rs6000_mangle_type): Add mangling for __dmr type.
+       (rs6000_dmr_register_move_cost): Add support for TDOmode.
+       (rs6000_split_multireg_move): Likewise.
+       (rs6000_invalid_conversion): Likewise.
+       * config/rs6000/rs6000.h (VECTOR_ALIGNMENT_P): Add TDOmode.
+       (enum rs6000_builtin_type_index): Add DMR type nodes.
+       (dmr_type_node): Likewise.
+       (ptr_dmr_type_node): Likewise.
+
+gcc/testsuite/
+
+       * gcc.target/powerpc/dm-1024bit.c: New test.
+
+==================== Branch mmaplus, patch #16 ====================
+
+RFC2653-Add dense math test for new instruction names.
+
+2024-10-22   Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/testsuite/
+
+       * gcc.target/powerpc/dm-double-test.c: New test.
+       * lib/target-supports.exp (check_effective_target_ppc_dmr_ok): New
+       target test.
+
+==================== Branch mmaplus, patch #15 ====================
+
+RFC2653-PowerPC: Switch to dense math names for all MMA operations.
+
+This patch changes the assembler instruction names for MMA instructions from
+the original name used in power10 to the new name when used with the dense math
+system.  I.e. xvf64gerpp becomes dmxvf64gerpp.  The assembler will emit the
+same bits for either spelling.
+
+For the non-prefixed MMA instructions, we add a 'dm' prefix in front of the
+instruction.  However, the prefixed instructions have a 'pm' prefix, and we add
+the 'dm' prefix afterwards.  To prevent having two sets of parallel int
+attributes, we remove the "pm" prefix from the instruction string in the
+attributes, and add it later, both in the insn name and in the output template.
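+
+The naming rule can be sketched as a small string builder.  The function
+mma_mnemonic is hypothetical and only mirrors the rule described above (base
+name stored without "pm", then "pm" and/or "dm" added in front); it is not the
+code in mma.md.
+
```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the assembler mnemonic for an MMA operation.
   base:       name without any "pm" prefix, e.g. "xvf64gerpp"
   prefixed:   nonzero for the prefixed form of the instruction
   dense_math: nonzero when the dense math spelling is wanted
   Returns the length snprintf reports for the full mnemonic.  */
static int
mma_mnemonic (char *buf, size_t size, const char *base,
              int prefixed, int dense_math)
{
  return snprintf (buf, size, "%s%s%s",
                   prefixed ? "pm" : "",
                   dense_math ? "dm" : "",
                   base);
}
```
+
+With this rule the "pm" prefix always ends up outermost, so the prefixed dense
+math spelling starts with "pmdm".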
+
+2024-10-22   Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       * config/rs6000/mma.md (vvi4i4i8): Change the instruction to not have a
+       "pm" prefix.
+       (avvi4i4i8): Likewise.
+       (vvi4i4i2): Likewise.
+       (avvi4i4i2): Likewise.
+       (vvi4i4): Likewise.
+       (avvi4i4): Likewise.
+       (pvi4i2): Likewise.
+       (apvi4i2): Likewise.
+       (vvi4i4i4): Likewise.
+       (avvi4i4i4): Likewise.
+       (mma_<vv>): Add support for running on DMF systems, generating the dense
+       math instruction and using the dense math accumulators.
+       (mma_<pv>): Likewise.
+       (mma_<avv>): Likewise.
+       (mma_<apv>): Likewise.
+       (mma_pm<vvi4i4i8>): Add support for running on DMF systems, generating
+       the dense math instruction and using the dense math accumulators.
+       Rename the insn with a 'pm' prefix and add either 'pm' or 'pmdm'
+       prefixes based on whether we have the original MMA specification or if
+       we have dense math support.
+       (mma_pm<avvi4i4i8>): Likewise.
+       (mma_pm<vvi4i4i2>): Likewise.
+       (mma_pm<avvi4i4i2>): Likewise.
+       (mma_pm<vvi4i4>): Likewise.
+       (mma_pm<avvi4i4>): Likewise.
+       (mma_pm<pvi4i2>): Likewise.
+       (mma_pm<apvi4i2>): Likewise.
+       (mma_pm<vvi4i4i4>): Likewise.
+       (mma_pm<avvi4i4i4>): Likewise.
+
+==================== Branch mmaplus, patch #14 ====================
+
+RFC2653-Add support for dense math registers.
+
+The MMA subsystem added the notion of accumulator registers as an optional
+feature of ISA 3.1 (power10).  In ISA 3.1, these accumulators overlapped with
+the VSX registers 0..31, but logically the accumulator registers were separate
+from the FPR registers.  In ISA 3.1, it was anticipated that in future systems,
+the accumulator registers may not overlap with the FPR registers.  This patch
+adds the support for dense math registers as separate registers.
+
+This particular patch does not change the MMA support to use the accumulators
+within the dense math registers.  This patch just adds the basic support for
+having separate DMRs.  The next patch will switch the MMA support to use the
+accumulators if -mcpu=future is used.
+
+For testing purposes, I added an undocumented option '-mdense-math' to enable
+or disable the dense math support.
+
+This patch adds a new constraint (wD).  If MMA is selected but dense math is
+not selected (i.e. -mcpu=power10), the wD constraint will allow access to
+accumulators that overlap with VSX registers 0..31.  If both MMA and dense math
+are selected (i.e. -mcpu=future), the wD constraint will only allow dense math
+registers.
+
+This patch modifies the existing %A output modifier.  If MMA is selected but
+dense math is not selected, then %A output modifier converts the VSX register
+number to the accumulator number, by dividing it by 4.  If both MMA and dense
+math are selected, then %A will map the separate DMR registers into 0..7.
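+
+A small model of the two %A mappings follows.  The divide-by-4 case is taken
+from the description above.  For the dense math case, subtracting the first
+DMR register number is an assumption about how the mapping into 0..7 is done,
+and first_dmr_regno is a hypothetical parameter, not the real value of
+FIRST_DMR_REGNO.
+
```c
#include <assert.h>

/* MMA without dense math: the accumulator overlaps four adjacent VSX
   registers, so VSX registers 0..31 map to accumulators 0..7.  */
static int
acc_from_vsx (int vsx_regno)
{
  return vsx_regno / 4;
}

/* MMA with dense math (assumed mapping): DMRs are separate registers,
   numbered consecutively from first_dmr_regno.  */
static int
acc_from_dmr (int regno, int first_dmr_regno)
{
  return regno - first_dmr_regno;
}
```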
+
+The intention is that user code using extended asm can be modified to run on
+both MMA without dense math and MMA with dense math:
+
+    1) If possible, don't use extended asm, but instead use the MMA built-in
+       functions;
+
+    2) If you do need to write extended asm, change the d constraints
+       targeting accumulators to use wD;
+
+    3) Only use the built-in zero, assemble and disassemble functions to
+       move data between vector quad types and dense math accumulators.
+       I.e. do not use the xxmfacc, xxmtacc, and xxsetaccz directly in the
+       extended asm code.  The reason is these instructions assume there is a
+       1-to-1 correspondence between 4 adjacent FPR registers and an
+       accumulator that overlaps with those registers.  With accumulators
+       now being separate registers, there no longer is a 1-to-1
+       correspondence.
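+
+The guidance above can be sketched with the existing MMA built-in functions.
+This is a minimal example and not taken from the patch: it assumes a compiler
+that defines __MMA__ (for example with -mcpu=power10), and the function names
+outer_product_f32 and have_dense_math are hypothetical.  On other targets only
+the have_dense_math helper is compiled.
+
```c
#include <assert.h>
#include <string.h>

#if defined (__MMA__)
/* Zero an accumulator, accumulate one float outer product into it, and
   copy the four result rows out, using only built-in functions so that
   no xxsetaccz/xxmfacc/xxmtacc appears in user-written asm.  */
void
outer_product_f32 (float out[4][4],
                   __vector unsigned char a,
                   __vector unsigned char b)
{
  __vector_quad acc;
  __vector float rows[4];

  __builtin_mma_xxsetaccz (&acc);
  __builtin_mma_xvf32gerpp (&acc, a, b);
  __builtin_mma_disassemble_acc (rows, &acc);
  memcpy (out, rows, sizeof rows);
}
#endif

/* Returns 1 when the compiler reports separate dense math registers
   through the __DENSE_MATH__ macro, and 0 otherwise.  */
int
have_dense_math (void)
{
#if defined (__DENSE_MATH__)
  return 1;
#else
  return 0;
#endif
}
```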
+
+It is possible that the mangling for DMRs and the GDB register numbers may
+produce other changes in the future.
+
+gcc/
+
+2024-10-22   Michael Meissner  <meiss...@linux.ibm.com>
+
+       * config/rs6000/mma.md (UNSPEC_MMA_DMSETDMRZ): New unspec.
+       (movxo): Add comments about dense math registers.
+       (movxo_nodm): Rename from movxo and restrict the usage to machines
+       without dense math registers.
+       (movxo_dm): New insn for movxo support for machines with dense math
+       registers.
+       (mma_<acc>): Restrict usage to machines without dense math registers.
+       (mma_xxsetaccz): Add a define_expand wrapper, and add support for dense
+       math registers.
+       (mma_dmsetaccz): New insn.
+       * config/rs6000/predicates.md (dmr_operand): New predicate.
+       (accumulator_operand): Add support for dense math registers.
+       * config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_mma_builtin): Do
+       not issue a de-prime instruction when disassembling a vector quad on a
+       system with dense math registers.
+       * config/rs6000/rs6000-c.cc (rs6000_define_or_undefine_macro): Define
+       __DENSE_MATH__ if we have dense math registers.
+       * config/rs6000/rs6000.cc (enum rs6000_reg_type): Add DMR_REG_TYPE.
+       (enum rs6000_reload_reg_type): Add RELOAD_REG_DMR.
+       (LAST_RELOAD_REG_CLASS): Add support for DMR registers and the wD
+       constraint.
+       (reload_reg_map): Likewise.
+       (rs6000_reg_names): Likewise.
+       (alt_reg_names): Likewise.
+       (rs6000_hard_regno_nregs_internal): Likewise.
+       (rs6000_hard_regno_mode_ok_uncached): Likewise.
+       (rs6000_debug_reg_global): Likewise.
+       (rs6000_setup_reg_addr_masks): Likewise.
+       (rs6000_init_hard_regno_mode_ok): Likewise.
+       (rs6000_secondary_reload_memory): Add support for DMR registers.
+       (rs6000_secondary_reload_simple_move): Likewise.
+       (rs6000_preferred_reload_class): Likewise.
+       (rs6000_secondary_reload_class): Likewise.
+       (print_operand): Make %A handle both FPRs and DMRs.
+       (rs6000_dmr_register_move_cost): New helper function.
+       (rs6000_register_move_cost): Add support for DMR registers.
+       (rs6000_memory_move_cost): Likewise.
+       (rs6000_compute_pressure_classes): Likewise.
+       (rs6000_debugger_regno): Likewise.
+       (rs6000_split_multireg_move): Add support for DMRs.
+       * config/rs6000/rs6000.h (TARGET_DENSE_MATH): New macro.
+       (TARGET_MMA_DENSE_MATH): Likewise.
+       (TARGET_MMA_NO_DENSE_MATH): Likewise.
+       (UNITS_PER_DMR_WORD): Likewise.
+       (FIRST_PSEUDO_REGISTER): Update for DMRs.
+       (FIXED_REGISTERS): Add DMRs.
+       (CALL_REALLY_USED_REGISTERS): Likewise.
+       (REG_ALLOC_ORDER): Likewise.
+       (DMR_REGNO_P): New macro.
+       (enum reg_class): Add DM_REGS.
+       (REG_CLASS_NAMES): Likewise.
+       (REG_CLASS_CONTENTS): Likewise.
+       (enum r6000_reg_class_enum): Add RS6000_CONSTRAINT_wD.
+       (REGISTER_NAMES): Add DMR registers.
+       (ADDITIONAL_REGISTER_NAMES): Likewise.
+       * config/rs6000/rs6000.md (FIRST_DMR_REGNO): New constant.
+       (LAST_DMR_REGNO): Likewise.
+
+==================== Branch mmaplus, patch #13 ====================
+
+RFC2653-Add wD constraint.
+
+This patch adds a new constraint ('wD') that matches the accumulator registers
+that overlap with VSX registers 0..31 on power10.  Future patches will add the
+support for a separate accumulator register class that will be used when the
+support for dense math registers is added.
+
+2024-10-22   Michael Meissner  <meiss...@linux.ibm.com>
+
+       * config/rs6000/constraints.md (wD): New constraint.
+       * config/rs6000/mma.md (mma_<acc>): Prepare for alternate accumulator
+       registers.  Use wD constraint instead of 'd' constraint.  Use
+       accumulator_operand instead of fpr_reg_operand.
+       (mma_<vv>): Likewise.
+       (mma_<avv>): Likewise.
+       (mma_<pv>): Likewise.
+       (mma_<apv>): Likewise.
+       (mma_<vvi4i4i8>): Likewise.
+       (mma_<avvi4i4i8>): Likewise.
+       (mma_<vvi4i4i2>): Likewise.
+       (mma_<avvi4i4i2>): Likewise.
+       (mma_<vvi4i4>): Likewise.
+       (mma_<avvi4i4>): Likewise.
+       (mma_<pvi4i2>): Likewise.
+       (mma_<apvi4i2>): Likewise.
+       (mma_<vvi4i4i4>): Likewise.
+       (mma_<avvi4i4i4>): Likewise.
+       * config/rs6000/predicates.md (accumulator_operand): New predicate.
+       * config/rs6000/rs6000.cc (rs6000_debug_reg_global): Print the register
+       class for the 'wD' constraint.
+       (rs6000_init_hard_regno_mode_ok): Set up the 'wD' register constraint
+       class.
+       * config/rs6000/rs6000.h (enum r6000_reg_class_enum): Add element for
+       the 'wD' constraint.
+       * doc/md.texi (PowerPC constraints): Document the 'wD' constraint.
+
+==================== Branch mmaplus, patch #12 ====================
+
+Use vector pair load/store for memcpy with -mcpu=future
+
+In the development for the power10 processor, GCC did not enable using the load
+vector pair and store vector pair instructions when optimizing things like
+memory copy.  This patch enables using those instructions if -mcpu=future is
+used.
+
+2024-10-22  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       * config/rs6000/rs6000-cpus.def (ISA_FUTURE_MASKS_SERVER): Enable using
+       load vector pair and store vector pair instructions for memory copy
+       operations.
+       (POWERPC_MASKS): Make the bit for enabling using load vector pair and
+       store vector pair operations set and reset when the PowerPC processor is
+       changed.
+
+==================== Branch mmaplus, patch #11 ====================
+
+Add -mcpu=future tuning support.
+
+This patch makes -mtune=future use the same tuning decision as -mtune=power11.
+
+2024-10-22  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       * config/rs6000/power10.md (all reservations): Add future as an
+       alternative to power10 and power11.
+
+==================== Branch mmaplus, patch #10 ====================
+
+Add support for -mcpu=future
+
+This patch adds the support that can be used in developing GCC support for
+future PowerPC processors.
+
+2024-10-22  Michael Meissner  <meiss...@linux.ibm.com>
+
+       * config.gcc (powerpc*-*-*): Add support for --with-cpu=future.
+       * config/rs6000/aix71.h (ASM_CPU_SPEC): Add support for -mcpu=future.
+       * config/rs6000/aix72.h (ASM_CPU_SPEC): Likewise.
+       * config/rs6000/aix73.h (ASM_CPU_SPEC): Likewise.
+       * config/rs6000/driver-rs6000.cc (asm_names): Likewise.
+       * config/rs6000/rs6000-arch.def: Add future cpu.
+       * config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): If
+       -mcpu=future, define _ARCH_FUTURE.
+       * config/rs6000/rs6000-cpus.def (FUTURE_MASKS_SERVER): New macro.
+       (future cpu): Define.
+       * config/rs6000/rs6000-opts.h (enum processor_type): Add
+       PROCESSOR_FUTURE.
+       * config/rs6000/rs6000-tables.opt: Regenerate.
+       * config/rs6000/rs6000.cc (power10_cost): Update comment.
+       (get_arch_flags): Add support for future processor.
+       (rs6000_option_override_internal): Likewise.
+       (rs6000_machine_from_flags): Likewise.
+       (rs6000_reassociation_width): Likewise.
+       (rs6000_adjust_cost): Likewise.
+       (rs6000_issue_rate): Likewise.
+       (rs6000_sched_reorder): Likewise.
+       (rs6000_sched_reorder2): Likewise.
+       (rs6000_register_move_cost): Likewise.
+       * config/rs6000/rs6000.h (ASM_CPU_SPEC): Likewise.
+       (TARGET_POWER11): New macro.
+       * config/rs6000/rs6000.md (cpu attribute): Likewise.
+
+==================== Branch mmaplus, patch #9 ====================
+
+Update tests to work with architecture flags changes.
+
+Two tests used -mvsx to raise the processor level to at least power7.  These
+tests were rewritten to add cpu=power7 support.
+
+I have built both big endian and little endian bootstrap compilers and there
+were no regressions.
+
+In addition, I constructed a test case that used every architecture define (like
+_ARCH_PWR4, etc.) and I also looked at the .machine directive generated.  I ran
+this test for all supported combinations of -mcpu, big/little endian, and 32/64
+bit support.  Every single instance generated exactly the same code with the
+patches installed compared to the compiler before installing the patches.
+
+Can I install this patch on the GCC 15 trunk?
+
+2024-10-22  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/testsuite/
+
+       * gcc.target/powerpc/ppc-target-4.c: Rewrite the test to add cpu=power7
+       when we need to add VSX support.  Add test for adding cpu=power7 no-vsx
+       to generate only Altivec instructions.
+       * gcc.target/powerpc/pr115688.c: Add cpu=power7 when requesting VSX
+       instructions.
+
+==================== Branch mmaplus, patch #8 ====================
+
+Change TARGET_MODULO to TARGET_POWER9
+
+As part of the architecture flags patches, this patch changes the use of
+TARGET_MODULO to TARGET_POWER9.  The modulo instructions were added in power9 (ISA
+3.0).  Note, I did not change the uses of TARGET_MODULO where it was explicitly
+generating different code if the machine had a modulo instruction.
+
+I have built both big endian and little endian bootstrap compilers and there
+were no regressions.
+
+In addition, I constructed a test case that used every architecture define (like
+_ARCH_PWR4, etc.) and I also looked at the .machine directive generated.  I ran
+this test for all supported combinations of -mcpu, big/little endian, and 32/64
+bit support.  Every single instance generated exactly the same code with the
+patches installed compared to the compiler before installing the patches.
+
+Can I install this patch on the GCC 15 trunk?
+
+2024-10-22  Michael Meissner  <meiss...@linux.ibm.com>
+
+       * config/rs6000/rs6000-builtin.cc (rs6000_builtin_is_supported): Use
+       TARGET_POWER9 instead of TARGET_MODULO.
+       * config/rs6000/rs6000.h (TARGET_CTZ): Likewise.
+       (TARGET_EXTSWSLI): Likewise.
+       (TARGET_MADDLD): Likewise.
+       * config/rs6000/rs6000.md (enabled attribute): Likewise.
+
+==================== Branch mmaplus, patch #7 ====================
+
+Change TARGET_POPCNTD to TARGET_POWER7
+
+As part of the architecture flags patches, this patch changes the use of
+TARGET_POPCNTD to TARGET_POWER7.  The POPCNTD instruction was added in power7
+(ISA 2.06).
+
+I have built both big endian and little endian bootstrap compilers and there
+were no regressions.
+
+In addition, I constructed a test case that used every architecture define (like
+_ARCH_PWR4, etc.) and I also looked at the .machine directive generated.  I ran
+this test for all supported combinations of -mcpu, big/little endian, and 32/64
+bit support.  Every single instance generated exactly the same code with the
+patches installed compared to the compiler before installing the patches.
+
+Can I install this patch on the GCC 15 trunk?
+
+2024-10-22  Michael Meissner  <meiss...@linux.ibm.com>
+
+       * config/rs6000/dfp.md (floatdidd2): Change TARGET_POPCNTD to
+       TARGET_POWER7.
+       * config/rs6000/rs6000-builtin.cc (rs6000_builtin_is_supported):
+       Likewise.
+       * config/rs6000/rs6000-string.cc (expand_block_compare_gpr): Likewise.
+       * config/rs6000/rs6000.cc (rs6000_hard_regno_mode_ok_uncached):
+       Likewise.
+       (rs6000_rtx_costs): Likewise.
+       (rs6000_emit_popcount): Likewise.
+       * config/rs6000/rs6000.h (TARGET_LDBRX): Likewise.
+       (TARGET_LFIWZX): Likewise.
+       (TARGET_FCFIDS): Likewise.
+       (TARGET_FCFIDU): Likewise.
+       (TARGET_FCFIDUS): Likewise.
+       (TARGET_FCTIDUZ): Likewise.
+       (TARGET_FCTIWUZ): Likewise.
+       (CTZ_DEFINED_VALUE_AT_ZERO): Likewise.
+       * config/rs6000/rs6000.md (enabled attribute): Likewise.
+       (ctz<mode>2): Likewise.
+       (popcntd<mode>2): Likewise.
+       (lrint<mode>si2): Likewise.
+       (lrint<mode>si): Likewise.
+       (lrint<mode>si_di): Likewise.
+       (cmpmemsi): Likewise.
+       (bpermd_<mode>): Likewise.
+       (addg6s): Likewise.
+       (cdtbcd): Likewise.
+       (cbcdtd): Likewise.
+       (div<div_extend>_<mode>): Likewise.
+
+==================== Branch mmaplus, patch #6 ====================
+
+Change TARGET_CMPB to TARGET_POWER6
+
+As part of the architecture flags patches, this patch changes the use of
+TARGET_CMPB to TARGET_POWER6.  The CMPB instruction was added in power6 (ISA
+2.05).
+
+I have built both big endian and little endian bootstrap compilers and there
+were no regressions.
+
+In addition, I constructed a test case that used every architecture define (like
+_ARCH_PWR4, etc.) and I also looked at the .machine directive generated.  I ran
+this test for all supported combinations of -mcpu, big/little endian, and 32/64
+bit support.  Every single instance generated exactly the same code with the
+patches installed compared to the compiler before installing the patches.
+
+Can I install this patch on the GCC 15 trunk?
+
+2024-10-22  Michael Meissner  <meiss...@linux.ibm.com>
+
+       * config/rs6000/rs6000-builtin.cc (rs6000_builtin_is_supported): Use
+       TARGET_POWER6 instead of TARGET_CMPB.
+       * config/rs6000/rs6000.h (TARGET_FCFID): Merge tests for popcntb, cmpb,
+       and popcntd into a single test for TARGET_POWER5.
+       (TARGET_LFIWAX): Use TARGET_POWER6 instead of TARGET_CMPB.
+       * config/rs6000/rs6000.md (enabled attribute): Likewise.
+       (parity<mode>2_cmp): Likewise.
+       (cmpb): Likewise.
+       (copysign<mode>3): Likewise.
+       (copysign<mode>3_fcpsgn): Likewise.
+       (cmpstrnsi): Likewise.
+       (cmpstrsi): Likewise.
+
+==================== Branch mmaplus, patch #5 ====================
+
+Change TARGET_FPRND to TARGET_POWER5X
+
+As part of the architecture flags patches, this patch changes the use of
+TARGET_FPRND to TARGET_POWER5X.  The FPRND instruction was added in power5+.
+
+I have built both big endian and little endian bootstrap compilers and there
+were no regressions.
+
+In addition, I constructed a test case that used every architecture define (like
+_ARCH_PWR4, etc.) and I also looked at the .machine directive generated.  I ran
+this test for all supported combinations of -mcpu, big/little endian, and 32/64
+bit support.  Every single instance generated exactly the same code with the
+patches installed compared to the compiler before installing the patches.
+
+Can I install this patch on the GCC 15 trunk?
+
+2024-10-22  Michael Meissner  <meiss...@linux.ibm.com>
+
+       * config/rs6000/rs6000.cc (report_architecture_mismatch): Use
+       TARGET_POWER5X instead of TARGET_FPRND.
+       * config/rs6000/rs6000.md (fmod<mode>3): Use TARGET_POWER5X instead of
+       TARGET_FPRND.
+       (remainder<mode>3): Likewise.
+       (fctiwuz_<mode>): Likewise.
+       (btrunc<mode>2): Likewise.
+       (ceil<mode>2): Likewise.
+       (floor<mode>2): Likewise.
+       (round<mode>): Likewise.
+
+==================== Branch mmaplus, patch #4 ====================
+
+Change TARGET_POPCNTB to TARGET_POWER5
+
+As part of the architecture flags patches, this patch changes the use of
+TARGET_POPCNTB to TARGET_POWER5.  The POPCNTB instruction was added in ISA 2.02
+(power5).
+
+I have built both big endian and little endian bootstrap compilers and there
+were no regressions.
+
+In addition, I constructed a test case that used every architecture define (like
+_ARCH_PWR4, etc.) and I also looked at the .machine directive generated.  I ran
+this test for all supported combinations of -mcpu, big/little endian, and 32/64
+bit support.  Every single instance generated exactly the same code with the
+patches installed compared to the compiler before installing the patches.
+
+Can I install this patch on the GCC 15 trunk?
+
+2024-10-22  Michael Meissner  <meiss...@linux.ibm.com>
+
+       * config/rs6000/rs6000-builtin.cc (rs6000_builtin_is_supported): Use
+       TARGET_POWER5 instead of TARGET_POPCNTB.
+       * config/rs6000/rs6000.h (TARGET_EXTRA_BUILTINS): Use TARGET_POWER5
+       instead of TARGET_POPCNTB.  Eliminate TARGET_CMPB and TARGET_POPCNTD
+       tests since TARGET_POWER5 will always be true for those tests.
+       (TARGET_FRE): Use TARGET_POWER5 instead of TARGET_POPCNTB.
+       (TARGET_FRSQRTES): Likewise.
+       * config/rs6000/rs6000.md (enabled attribute): Likewise.
+       (popcount<mode>): Use TARGET_POWER5 instead of TARGET_POPCNTB.  Drop
+       test for TARGET_POPCNTD (i.e. power7), since TARGET_POPCNTB will always
+       be set if TARGET_POPCNTD is set.
+       (popcntb<mode>2): Use TARGET_POWER5 instead of TARGET_POPCNTB.
+       (parity<mode>2): Likewise.
+       (parity<mode>2_cmpb): Remove TARGET_POPCNTB test, since it will always
+       be true when TARGET_CMPB (i.e. power6) is set.
+
+
+==================== Branch mmaplus, patch #3 ====================
+
+Do not allow -mvsx to boost processor to power7.
+
+This patch restructures the code so that -mvsx for example will not silently
+convert the processor to power7.  The user must now use -mcpu=power7 or higher.
+This means if the user does -mvsx and the default processor does not have VSX
+support, it will be an error.
+
+I have built both big endian and little endian bootstrap compilers and there
+were no regressions.
+
+In addition, I constructed a test case that used every architecture define (like
+_ARCH_PWR4, etc.) and I also looked at the .machine directive generated.  I ran
+this test for all supported combinations of -mcpu, big/little endian, and 32/64
+bit support.  Every single instance generated exactly the same code with the
+patches installed compared to the compiler before installing the patches.
+
+Can I install this patch on the GCC 15 trunk?
+
+2024-10-22  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       * config/rs6000/rs6000.cc (report_architecture_mismatch): New function.
+       Report an error if the user used an option such as -mvsx when the
+       default processor would not allow the option.
+       (rs6000_option_override_internal): Move some ISA checking code into
+       report_architecture_mismatch.
+
+==================== Branch mmaplus, patch #2 ====================
+
+Use architecture flags for defining _ARCH_PWR macros.
+
+For the newer architectures, this patch changes GCC to define the _ARCH_PWR<n>
+macros using the new architecture flags instead of relying on isa options like
+-mpower10.
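+
+From the user's side, the _ARCH_PWR<n> macros can be probed with ordinary
+preprocessor tests.  The sketch below is illustrative only: the function name
+highest_arch_pwr is hypothetical, and _ARCH_PWR11 is assumed to follow the
+same naming pattern as the older macros.  On a non-PowerPC compiler none of
+the macros are defined and the function returns 0.
+
```c
#include <assert.h>

/* Report the newest _ARCH_PWR<n> level the compiler says it targets,
   or 0 when none of the macros is defined.  */
static int
highest_arch_pwr (void)
{
#if defined (_ARCH_PWR11)   /* assumed name, see the note above */
  return 11;
#elif defined (_ARCH_PWR10)
  return 10;
#elif defined (_ARCH_PWR9)
  return 9;
#elif defined (_ARCH_PWR8)
  return 8;
#elif defined (_ARCH_PWR7)
  return 7;
#elif defined (_ARCH_PWR6)
  return 6;
#elif defined (_ARCH_PWR5)
  return 5;
#elif defined (_ARCH_PWR4)
  return 4;
#else
  return 0;
#endif
}
```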
+
+The -mpower8-internal, -mpower10, and -mpower11 options were removed.  The
+-mpower11 option was removed completely, since it was just added in GCC 15.  The
+other two options were marked as WarnRemoved, and the various ISA bits were
+removed.
+
+TARGET_POWER8 and TARGET_POWER10 were re-defined to use the architecture bits
+instead of the ISA bits.
+
+There are other internal isa bits that aren't removed with this patch because
+the built-in function support uses those bits.
+
+I have built both big endian and little endian bootstrap compilers and there
+were no regressions.
+
+In addition, I constructed a test case that used every architecture define (like
+_ARCH_PWR4, etc.) and I also looked at the .machine directive generated.  I ran
+this test for all supported combinations of -mcpu, big/little endian, and 32/64
+bit support.  Every single instance generated exactly the same code with the
+patches installed compared to the compiler before installing the patches.
+
+Can I install this patch on the GCC 15 trunk?
+
+2024-10-22  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       * config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Add support to
+       use architecture flags instead of ISA flags for setting most of the
+       _ARCH_PWR* macros.
+       (rs6000_cpu_cpp_builtins): Update rs6000_target_modify_macros call.
+       * config/rs6000/rs6000-cpus.def (ISA_2_7_MASKS_SERVER): Remove
+       OPTION_MASK_POWER8.
+       (ISA_3_1_MASKS_SERVER): Remove OPTION_MASK_POWER10.
+       (POWER11_MASKS_SERVER): Remove OPTION_MASK_POWER11.
+       (POWERPC_MASKS): Remove OPTION_MASK_POWER8, OPTION_MASK_POWER10, and
+       OPTION_MASK_POWER11.
+       * config/rs6000/rs6000-protos.h (rs6000_target_modify_macros): Update
+       declaration.
+       (rs6000_target_modify_macros_ptr): Likewise.
+       * config/rs6000/rs6000.cc (rs6000_target_modify_macros_ptr): Likewise.
+       (rs6000_option_override_internal): Use architecture flags instead of ISA
+       flags.
+       (rs6000_opt_masks): Remove -mpower10 and -mpower11, which are no longer
+       in the ISA flags.
+       (rs6000_pragma_target_parse): Use architecture flags as well as ISA
+       flags.
+       * config/rs6000/rs6000.h (TARGET_POWER4): New macro.
+       (TARGET_POWER5): Likewise.
+       (TARGET_POWER5X): Likewise.
+       (TARGET_POWER6): Likewise.
+       (TARGET_POWER7): Likewise.
+       (TARGET_POWER8): Likewise.
+       (TARGET_POWER9): Likewise.
+       (TARGET_POWER10): Likewise.
+       (TARGET_POWER11): Likewise.
+       * config/rs6000/rs6000.opt (-mpower8-internal): Remove ISA flag bits.
+       (-mpower10): Likewise.
+       (-mpower11): Likewise.
+
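A test along the lines described for patch #2 might look like the sketch below.
It is not the author's test case, which is not included in this message.  The
function name defined_arch_macros is hypothetical, and the list of macros is an
assumption that follows the TARGET_POWER<n> names in the ChangeLog entry above.
Compiling it with different -mcpu= values and comparing the result before and
after the patch is one way to check that the set of defined macros is
unchanged.

```c
#include <stddef.h>

/* Store the names of the _ARCH_PWR<n> macros that are defined for this
   compilation into NAMES (at most MAX entries) and return the count.  */
static size_t
defined_arch_macros (const char *names[], size_t max)
{
  size_t n = 0;

  /* Avoid unused-parameter warnings when no macro is defined.  */
  (void) names;
  (void) max;

#define ADD(name) do { if (n < max) names[n++] = (name); } while (0)
#ifdef _ARCH_PWR4
  ADD ("_ARCH_PWR4");
#endif
#ifdef _ARCH_PWR5
  ADD ("_ARCH_PWR5");
#endif
#ifdef _ARCH_PWR5X
  ADD ("_ARCH_PWR5X");
#endif
#ifdef _ARCH_PWR6
  ADD ("_ARCH_PWR6");
#endif
#ifdef _ARCH_PWR7
  ADD ("_ARCH_PWR7");
#endif
#ifdef _ARCH_PWR8
  ADD ("_ARCH_PWR8");
#endif
#ifdef _ARCH_PWR9
  ADD ("_ARCH_PWR9");
#endif
#ifdef _ARCH_PWR10
  ADD ("_ARCH_PWR10");
#endif
#ifdef _ARCH_PWR11
  ADD ("_ARCH_PWR11");
#endif
#undef ADD

  return n;
}
```

On a non-PowerPC host none of these macros is defined and the function returns
0, so the sketch still compiles there.
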
+==================== Branch mmaplus, patch #1 ====================
+
+Add rs6000 architecture masks.
+
+This patch begins the journey to move architecture bits that are not user ISA
+options from rs6000_isa_flags to a new target variable rs6000_arch_flags.  The
+intention is to remove switches that are currently ISA options but that users
+should not be using directly.  For example, we want users to use -mcpu=power10
+and not just -mpower10.
+
+This patch also changes the target_clones support to use an architecture mask
+instead of isa bits.
+
+This patch also switches the handling of .machine to use architecture masks if
+they exist (power4 through power11).  All of the other PowerPCs will continue to
+use the existing code for setting the .machine option.
+
+I have built both big endian and little endian bootstrap compilers and there
+were no regressions.
+
+In addition, I constructed a test case that used every architecture define (like
+_ARCH_PWR4, etc.) and I also looked at the .machine directive generated.  I ran
+this test for all supported combinations of -mcpu, big/little endian, and 32/64
+bit support.  Every single instance generated exactly the same code with the
+patches installed compared to the compiler before installing the patches.
+
+Can I install this patch on the GCC 15 trunk?
+
+2024-10-22  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       * config/rs6000/rs6000-arch.def: New file.
+       * config/rs6000/rs6000.cc (struct clone_map): Switch to using
+       architecture masks instead of ISA masks.
+       (rs6000_clone_map): Likewise.
+       (rs6000_print_isa_options): Add an architecture flags argument, change
+       all callers.
+       (get_arch_flag): New function.
+       (rs6000_debug_reg_global): Update rs6000_print_isa_options calls.
+       (rs6000_option_override_internal): Likewise.
+       (rs6000_machine_from_flags): Switch to using architecture masks instead
+       of ISA masks.
+       (struct rs6000_arch_mask): New structure.
+       (rs6000_arch_masks): New table of architecture masks and names.
+       (rs6000_function_specific_save): Save architecture flags.
+       (rs6000_function_specific_restore): Restore architecture flags.
+       (rs6000_function_specific_print): Update rs6000_print_isa_options calls.
+       (rs6000_print_options_internal): Add architecture flags options.
+       (rs6000_clone_priority): Switch to using architecture masks instead of
+       ISA masks.
+       (rs6000_can_inline_p): Don't allow inlining if the callee requires a newer
+       architecture than the caller.
+       * config/rs6000/rs6000.h: Use rs6000-arch.def to create the architecture
+       masks.
+       * config/rs6000/rs6000.opt (rs6000_arch_flags): New target variable.
+       (x_rs6000_arch_flags): New save/restore field for rs6000_arch_flags.
+
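The ChangeLog entry for patch #1 says that rs6000.h uses rs6000-arch.def to
create the architecture masks, and that inlining is refused when the callee
needs a newer architecture than the caller.  The sketch below shows one common
way to do this kind of thing, the X-macro technique.  It is an illustration
only: the ARCH_LIST macro stands in for a .def file, and every identifier in
it (ARCH_BIT_*, ARCH_MASK_*, arch_mask_name, arch_can_inline_p) is
hypothetical.  The subset test used for inlining is an assumption about what
"requires a newer architecture" means, not a copy of rs6000_can_inline_p.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the contents of a .def file.  */
#define ARCH_LIST(X)        \
  X (POWER4,  "power4")     \
  X (POWER7,  "power7")     \
  X (POWER10, "power10")

/* First expansion: one bit number per architecture.  */
enum arch_bit
{
#define X(NAME, STR) ARCH_BIT_##NAME,
  ARCH_LIST (X)
#undef X
  ARCH_BIT_MAX
};

/* Second expansion: one mask per architecture.  */
enum arch_mask
{
#define X(NAME, STR) ARCH_MASK_##NAME = 1 << ARCH_BIT_##NAME,
  ARCH_LIST (X)
#undef X
  ARCH_MASK_NONE = 0
};

/* Third expansion: table mapping masks to printable names.  */
struct arch_name
{
  unsigned mask;
  const char *name;
};

static const struct arch_name arch_names[] = {
#define X(NAME, STR) { ARCH_MASK_##NAME, STR },
  ARCH_LIST (X)
#undef X
};

/* Return the name for a single-bit MASK, or NULL if it is unknown.  */
static const char *
arch_mask_name (unsigned mask)
{
  for (size_t i = 0; i < sizeof arch_names / sizeof arch_names[0]; i++)
    if (arch_names[i].mask == mask)
      return arch_names[i].name;
  return NULL;
}

/* Allow inlining only if every architecture bit the callee needs is
   also set for the caller (assumed interpretation).  */
static bool
arch_can_inline_p (unsigned caller_arch, unsigned callee_arch)
{
  return (callee_arch & ~caller_arch) == 0;
}
```

Keeping the list in one place means that adding an architecture updates the
bit numbers, the masks, and the name table together.
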
+==================== Branch mmaplus, baseline ====================
+
 Add ChangeLog.dmf and update REVISION.
 
 2024-10-22  Michael Meissner  <meiss...@linux.ibm.com>
