[gcc r16-1406] diagnostics: make experimental-html sink prettier [PR116792]

2025-06-10 Thread David Malcolm via Gcc-cvs
https://gcc.gnu.org/g:cb1d203445c923aa64bca01b0ffb6d3d16a82130

commit r16-1406-gcb1d203445c923aa64bca01b0ffb6d3d16a82130
Author: David Malcolm 
Date:   Tue Jun 10 20:06:38 2025 -0400

diagnostics: make experimental-html sink prettier [PR116792]

This patch to the "experimental-html" diagnostic sink:
* adds use of the PatternFly 3 CSS library (via an optional link
  in the generated html to a copy in a CDN)
* uses PatternFly's "alert" pattern to show severities for diagnostics,
  properly nesting "note" diagnostics for diagnostic groups.
  Example:
before: 
https://dmalcolm.fedorapeople.org/gcc/2025-06-10/before/diagnostic-ranges.c.html
 after: 
https://dmalcolm.fedorapeople.org/gcc/2025-06-10/after/diagnostic-ranges.c.html

* adds initial support for logical locations and physical locations
* adds initial support for multi-level nested diagnostics such as those
  for C++ concepts diagnostics.  Ideally this would show a clickable
  disclosure widget to expand/collapse a level, but for now it uses
  nested  elements with  for the child diagnostics.
  Example:
before: 
https://dmalcolm.fedorapeople.org/gcc/2025-06-10/before/nested-diagnostics-1.C.html
 after: 
https://dmalcolm.fedorapeople.org/gcc/2025-06-10/after/nested-diagnostics-1.C.html

gcc/ChangeLog:
PR other/116792
* diagnostic-format-html.cc: Include "diagnostic-path.h" and
"diagnostic-client-data-hooks.h".
(html_builder::m_logical_loc_mgr): New field.
(html_builder::m_cur_nesting_levels): New field.
(html_builder::m_last_logical_location): New field.
(html_builder::m_last_location): New field.
(html_builder::m_last_expanded_location): New field.
(HTML_STYLE): Add "white-space: pre;" to .source and .annotation.
Add "gcc-quoted-text" CSS class.
(html_builder::html_builder): Initialize the new fields.  If CSS
is enabled, add CDN links to PatternFly 3 stylesheets.
(html_builder::add_stylesheet): New.
(html_builder::on_report_diagnostic): Add "alert" param to
make_element_for_diagnostic, setting it by default, but unsetting
it for nested diagnostics below the top level.  Use
add_at_nesting_level for nested diagnostics.
(add_nesting_level_attr): New.
(html_builder::add_at_nesting_level): New.
(get_pf_class_for_alert_div): New.
(get_pf_class_for_alert_icon): New.
(get_label_for_logical_location_kind): New.
(add_labelled_value): New.
(html_builder::make_element_for_diagnostic): Add leading comment.
Add "alert" param.  Drop class="gcc-diagnostic" from  tag,
instead adding the class for a PatternFly 3 alert if "alert" is
true, and adding a  with an alert icon, both according to
the diagnostic severity.  Add a severity prefix to the message for
alerts.  Add any metadata/option text as suffixes to the message.
Show any logical location.  Show any physical location.  Don't
show the locus if the last location is unchanged within the
diagnostic_group.  Wrap any execution path element in a
 and add a label to it.  Wrap any
generated patch in a  and add a label
to it.
(selftest::test_simple_log): Update expected HTML.

gcc/testsuite/ChangeLog:
PR other/116792
* gcc.dg/html-output/missing-semicolon.py: Update for changes
to diagnostic elements.
* gcc.dg/format/diagnostic-ranges-html.py: Likewise.
* gcc.dg/plugin/diagnostic-test-metadata-html.py: Likewise.  Drop
out-of-date comment.
* gcc.dg/plugin/diagnostic-test-paths-2.py: Likewise.
* gcc.dg/plugin/diagnostic-test-paths-4.py: Likewise.  Drop
out-of-date comment.
* gcc.dg/plugin/diagnostic-test-show-locus.py: Likewise.
* lib/htmltest.py (get_diag_by_index): Update to use search by id.
(get_message_within_diag): Update to use search by class.

libcpp/ChangeLog:
PR other/116792
* include/line-map.h (typedef expanded_location): Convert to...
(struct expanded_location): ...this.
(operator==): New decl, for expanded_location.
(operator!=): Likewise.
* line-map.cc (operator==): New decl, for expanded_location.

Signed-off-by: David Malcolm 

Diff:
---
 gcc/diagnostic-format-html.cc  | 433 +++--
 .../gcc.dg/format/diagnostic-ranges-html.py|  15 +-
 .../gcc.dg/html-output/missing-semicolon.py|  68 +++-
 .../gcc.dg/plugin/diagnostic-test-metadata-html.py |  38 +-
 .../gcc.dg/plugin/diagnostic-test-paths-2.py   |

[gcc r16-1405] diagnostics: xml: add add_text_from_pp

2025-06-10 Thread David Malcolm via Gcc-cvs
https://gcc.gnu.org/g:896edb1d0ae90ff1f60a6b894f04eb3c436790f5

commit r16-1405-g896edb1d0ae90ff1f60a6b894f04eb3c436790f5
Author: David Malcolm 
Date:   Tue Jun 10 20:06:38 2025 -0400

diagnostics: xml: add add_text_from_pp

Various places use
  xp.add_text (pp_formatted_text (&pp))
Add a helper function for this.
No functional change intended.
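
The shape of the helper can be sketched outside GCC as follows.  This is
only an illustration: toy_pp, toy_formatted_text and toy_printer are
hypothetical stand-ins for pretty_printer, pp_formatted_text and
xml::printer, not the real classes.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for GCC's pretty_printer: just holds text.
struct toy_pp
{
  std::string m_buf;
};

// Hypothetical stand-in for pp_formatted_text.
inline const char *
toy_formatted_text (toy_pp &pp)
{
  return pp.m_buf.c_str ();
}

// Hypothetical stand-in for xml::printer; records the text it is given.
struct toy_printer
{
  void add_text (std::string text)
  {
    m_texts.push_back (std::move (text));
  }

  // The helper: callers no longer spell out the formatted-text call.
  void add_text_from_pp (toy_pp &pp)
  {
    add_text (toy_formatted_text (pp));
  }

  std::vector<std::string> m_texts;
};

// Returns true if the longhand and the helper add the same text.
inline bool
toy_helper_matches_longhand ()
{
  toy_pp pp;
  pp.m_buf = "(depth 1)";
  toy_printer a, b;
  a.add_text (toy_formatted_text (pp));
  b.add_text_from_pp (pp);
  return a.m_texts == b.m_texts;
}
```

The helper only forwards to add_text, which is consistent with the
"no functional change" claim above.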

gcc/ChangeLog:
* diagnostic-path-output.cc: Use xml::printer::add_text_from_pp.
* diagnostic-show-locus.cc: Likewise.
* xml-printer.h (xml::printer::add_text_from_pp): New decl.
* xml.cc (xml::node_with_children::add_text_from_pp): New.
(xml::printer::add_text_from_pp): New.
* xml.h (xml::node_with_children::add_text_from_pp): New decl.

Signed-off-by: David Malcolm 

Diff:
---
 gcc/diagnostic-path-output.cc |  6 +++---
 gcc/diagnostic-show-locus.cc  |  2 +-
 gcc/xml-printer.h |  1 +
 gcc/xml.cc| 12 
 gcc/xml.h |  1 +
 5 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/gcc/diagnostic-path-output.cc b/gcc/diagnostic-path-output.cc
index 199028ea7f3a..bae24bf01a70 100644
--- a/gcc/diagnostic-path-output.cc
+++ b/gcc/diagnostic-path-output.cc
@@ -689,7 +689,7 @@ struct event_range
iter_event.print_desc (pp);
if (event_label_writer)
  event_label_writer->begin_label ();
-   xp.add_text (pp_formatted_text (&pp));
+   xp.add_text_from_pp (pp);
if (event_label_writer)
  event_label_writer->end_label ();
  }
@@ -1243,7 +1243,7 @@ print_path_summary_as_html (const path_summary &ps,
else
  pp_printf (&pp, "events %i-%i",
 range->m_start_idx + 1, range->m_end_idx + 1);
-   xp.add_text (pp_formatted_text (&pp));
+   xp.add_text_from_pp (pp);
xp.pop_tag ("span");
   }
   if (show_depths)
@@ -1252,7 +1252,7 @@ print_path_summary_as_html (const path_summary &ps,
  xp.push_tag_with_class ("span", "depth", true);
  pretty_printer pp;
  pp_printf (&pp, "(depth %i)", range->m_stack_depth);
- xp.add_text (pp_formatted_text (&pp));
+ xp.add_text_from_pp (pp);
  xp.pop_tag ("span");
}
   xp.pop_tag ("div");
diff --git a/gcc/diagnostic-show-locus.cc b/gcc/diagnostic-show-locus.cc
index 575c7ec8d709..ffb72da138d9 100644
--- a/gcc/diagnostic-show-locus.cc
+++ b/gcc/diagnostic-show-locus.cc
@@ -614,7 +614,7 @@ struct to_html
   {
 pp_clear_output_area (&m_scratch_pp);
 pp_unicode_character (&m_scratch_pp, ch);
-m_xp.add_text (pp_formatted_text (&m_scratch_pp));
+m_xp.add_text_from_pp (m_scratch_pp);
   }
 
   void add_utf8_byte (char b)
diff --git a/gcc/xml-printer.h b/gcc/xml-printer.h
index 24ac2f42e735..428da0a4245d 100644
--- a/gcc/xml-printer.h
+++ b/gcc/xml-printer.h
@@ -44,6 +44,7 @@ public:
   void set_attr (const char *name, std::string value);
 
   void add_text (std::string text);
+  void add_text_from_pp (pretty_printer &pp);
 
   void add_raw (std::string text);
 
diff --git a/gcc/xml.cc b/gcc/xml.cc
index 0a925619f5d3..9077c1ab1300 100644
--- a/gcc/xml.cc
+++ b/gcc/xml.cc
@@ -121,6 +121,11 @@ node_with_children::add_text (std::string str)
   add_child (std::make_unique<text> (std::move (str)));
 }
 
+void
+node_with_children::add_text_from_pp (pretty_printer &pp)
+{
+  add_text (pp_formatted_text (&pp));
+}
 
 /* struct document : public node_with_children.  */
 
@@ -251,6 +256,13 @@ printer::add_text (std::string text)
   parent->add_text (std::move (text));
 }
 
+void
+printer::add_text_from_pp (pretty_printer &pp)
+{
+  element *parent = m_open_tags.back ();
+  parent->add_text_from_pp (pp);
+}
+
 void
 printer::add_raw (std::string text)
 {
diff --git a/gcc/xml.h b/gcc/xml.h
index 3c5813a22862..952cfa4376b0 100644
--- a/gcc/xml.h
+++ b/gcc/xml.h
@@ -65,6 +65,7 @@ struct node_with_children : public node
 {
   void add_child (std::unique_ptr<node> node);
   void add_text (std::string str);
+  void add_text_from_pp (pretty_printer &pp);
 
   std::vector<std::unique_ptr<node>> m_children;
 };


[gcc r16-1403] gimple-ssa-warn-access: add missing auto_diagnostic_group

2025-06-10 Thread David Malcolm via Gcc-cvs
https://gcc.gnu.org/g:b619b4d7e7a5078d4fe8b1c4e89258ce4d21be4d

commit r16-1403-gb619b4d7e7a5078d4fe8b1c4e89258ce4d21be4d
Author: David Malcolm 
Date:   Tue Jun 10 20:06:37 2025 -0400

gimple-ssa-warn-access: add missing auto_diagnostic_group

Spotted whilst implementing nesting support in the
experimental-html diagnostic sink.
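
As a rough illustration of why the RAII group matters for nesting, here
is a toy model.  It is not GCC's implementation: toy_diag_context and
toy_auto_group are hypothetical names, and "indent every diagnostic
after the first one in a group" is an assumption of the sketch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical diagnostic context that records emitted lines.
struct toy_diag_context
{
  int m_group_depth = 0;
  int m_emitted_in_group = 0;
  std::vector<std::string> m_lines;

  // Emit a diagnostic; within a group, everything after the first
  // diagnostic is shown as nested (indented).
  void emit (const std::string &kind, const std::string &msg)
  {
    const bool nested = m_group_depth > 0 && m_emitted_in_group > 0;
    if (m_group_depth > 0)
      ++m_emitted_in_group;
    m_lines.push_back (std::string (nested ? "  " : "") + kind + ": " + msg);
  }
};

// Hypothetical analogue of auto_diagnostic_group: opens a group on
// construction and closes it on destruction.
class toy_auto_group
{
public:
  explicit toy_auto_group (toy_diag_context &ctx)
  : m_ctx (ctx)
  {
    ++m_ctx.m_group_depth;
  }
  ~toy_auto_group ()
  {
    if (--m_ctx.m_group_depth == 0)
      m_ctx.m_emitted_in_group = 0;
  }
  toy_auto_group (const toy_auto_group &) = delete;
  toy_auto_group &operator= (const toy_auto_group &) = delete;

private:
  toy_diag_context &m_ctx;
};
```

In this model a note emitted without an enclosing group is recorded as a
top-level diagnostic, whereas the same note emitted while a
toy_auto_group is live is recorded as a child of the preceding warning.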

gcc/ChangeLog:
* gimple-ssa-warn-access.cc
(pass_waccess::maybe_check_dealloc_call): Add missing
auto_diagnostic_group to nest the "returned from %qD"
note within the warning.

Signed-off-by: David Malcolm 

Diff:
---
 gcc/gimple-ssa-warn-access.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/gimple-ssa-warn-access.cc b/gcc/gimple-ssa-warn-access.cc
index 305b63567fea..0f4aff6b59b5 100644
--- a/gcc/gimple-ssa-warn-access.cc
+++ b/gcc/gimple-ssa-warn-access.cc
@@ -3767,6 +3767,7 @@ pass_waccess::maybe_check_dealloc_call (gcall *call)
 
   if (is_gimple_call (def_stmt))
{
+ auto_diagnostic_group d;
  bool warned = false;
  if (gimple_call_alloc_p (def_stmt))
{


[gcc r16-1404] diagnostics: fix tag nesting issues in experimental-html sink [PR120610]

2025-06-10 Thread David Malcolm via Gcc-cvs
https://gcc.gnu.org/g:3dcce649a1e0833a4c3bb9ced4b9c0b38c3fb8a5

commit r16-1404-g3dcce649a1e0833a4c3bb9ced4b9c0b38c3fb8a5
Author: David Malcolm 
Date:   Tue Jun 10 20:06:37 2025 -0400

diagnostics: fix tag nesting issues in experimental-html sink [PR120610]

I've been seeing issues in the experimental-html sink where the nesting
of tags goes wrong.

The two issues I've seen are:
* the pp_token_list from the diagnostic message that reaches the
  html_token_printer doesn't always have matching pairs of begin/end
  tokens (PR other/120610)
* a bug in diagnostic-show-locus where there was a stray xp.pop_tag,
  in print_trailing_fixits.

This patch:
* changes the xml::printer::pop_tag API so that it now takes the
  expected name of the element being popped (rather than expressing this
  in comments), and that, by default, the xml::printer asserts that this
  matches.
* gives the html_token_printer its own xml::printer instance to restrict
  the affected area of the DOM tree; this xml::printer doesn't enforce
  nesting (PR other/120610)
* adds RAII sentinel classes that automatically check for pushes/pops
  being balanced within a scope, using them in various places
* fixes the bug in print_trailing_fixits for html output
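
The two checking mechanisms described above can be sketched as follows.
This is a simplified model, not the GCC code: toy_xml_printer and
toy_check_tag_nesting are hypothetical names, and the sketch throws
std::logic_error on a mismatch (an assumption made so that the failure
is observable), whereas the patch describes assertions.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical printer that tracks the stack of open tags.
class toy_xml_printer
{
public:
  explicit toy_xml_printer (bool check_popped_tags = true)
  : m_check_popped_tags (check_popped_tags)
  {
  }

  void push_tag (std::string name)
  {
    m_open_tags.push_back (std::move (name));
  }

  // Pop the innermost tag, verifying it is the one the caller expects.
  void pop_tag (const char *expected_name)
  {
    if (m_open_tags.empty ())
      throw std::logic_error ("no tag to pop");
    if (m_check_popped_tags && m_open_tags.back () != expected_name)
      throw std::logic_error ("popped tag does not match expected name");
    m_open_tags.pop_back ();
  }

  std::size_t get_num_open_tags () const
  {
    return m_open_tags.size ();
  }

private:
  bool m_check_popped_tags;
  std::vector<std::string> m_open_tags;
};

// Hypothetical RAII sentinel: checks that pushes and pops are balanced
// between its construction and destruction.
class toy_check_tag_nesting
{
public:
  explicit toy_check_tag_nesting (const toy_xml_printer &xp)
  : m_xp (xp), m_initial (xp.get_num_open_tags ())
  {
  }
  ~toy_check_tag_nesting ()
  {
    assert (balanced ());
  }
  bool balanced () const
  {
    return m_xp.get_num_open_tags () == m_initial;
  }

private:
  const toy_xml_printer &m_xp;
  std::size_t m_initial;
};
```

Passing the expected name to pop_tag turns what used to be a comment
into a checked invariant, and the sentinel localizes an imbalance to the
scope that caused it.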

gcc/ChangeLog:
PR other/120610
* diagnostic-format-html.cc (html_builder::html_builder): Update
for new param of xml::printer::pop_tag.
(html_path_label_writer::end_label): Likewise.
(html_builder::make_element_for_diagnostic::html_token_printer):
Give the instance its own xml::printer.  Update for new param of
xml::printer::pop_tag.
(html_builder::make_element_for_diagnostic): Give the instance its
own xml::printer.
(html_builder::make_metadata_element): Update for new param of
xml::printer::pop_tag.
(html_builder::flush_to_file): Likewise.
* diagnostic-path-output.cc (begin_html_stack_frame): Likewise.
(begin_html_stack_frame): Likewise.
(end_html_stack_frame): Likewise.
(print_path_summary_as_html): Likewise.
* diagnostic-show-locus.cc
(struct to_text::auto_check_tag_nesting): New.
(struct to_html:: auto_check_tag_nesting): New.
(to_text::pop_html_tag): Change param to const char *.
(to_html::pop_html_tag): Likewise; rename param to
"expected_name".
(default_diagnostic_start_span_fn): Update for new param
of xml::printer::pop_tag.
(layout_printer::end_label): Likewise.
(layout_printer::print_trailing_fixits): Add RAII sentinel
to check tag nesting for the HTML case.  Delete stray popping
of "td" in the presence of fix-it hints.
(layout_printer::print_line): Add RAII sentinel
to check tag nesting for the HTML case.
(diagnostic_source_print_policy::print_as_html): Likewise.
(layout_printer::print): Likewise.
* xml-printer.h (xml::printer::printer): Add optional
"check_popped_tags" param.
(xml::printer::pop_tag): Add "expected_name" param.
(xml::printer::get_num_open_tags): New accessor.
(xml::printer::dump): New decl.
(xml::printer::m_check_popped_tags): New field.
(class xml::auto_check_tag_nesting): New.
(class xml::auto_print_element): Update for new param of pop_tag.
* xml.cc: Move pragma pop so that the pragma also covers
xml::printer's member functions, "dump" in particular.
(xml::printer::printer): Add param "check_popped_tags".
(xml::printer::pop_tag): Add param "expected_name" and use it to
assert that the popped tag is as expected.  Assert that we have a
tag to pop.
(xml::printer::dump): New.
(selftest::test_printer): Update for new param of pop_tag.
(selftest::test_attribute_ordering): Likewise.

gcc/testsuite/ChangeLog:
PR other/120610
* gcc.dg/format/diagnostic-ranges-html.py: Remove out-of-date
comment.

Signed-off-by: David Malcolm 

Diff:
---
 gcc/diagnostic-format-html.cc  | 29 --
 gcc/diagnostic-path-output.cc  | 28 ++---
 gcc/diagnostic-show-locus.cc   | 36 +
 .../gcc.dg/format/diagnostic-ranges-html.py| 23 ---
 gcc/xml-printer.h  | 41 ---
 gcc/xml.cc | 46 --
 6 files changed, 131 insertions(+), 72 deletions(-)

diff --git a/gcc/diagnostic-format-html.cc b/gcc/diagnostic-format-html.cc
index ddf6ba0f4cfc..6010712b8a5b 100644
--- a/

[gcc r15-9818] libstdc++: Fix std::format thousands separators when sign present [PR120548]

2025-06-10 Thread Jonathan Wakely via Libstdc++-cvs
https://gcc.gnu.org/g:974d59aec69b35bd4f7f8464a3bcfc55e849ed1f

commit r15-9818-g974d59aec69b35bd4f7f8464a3bcfc55e849ed1f
Author: Jonathan Wakely 
Date:   Wed Jun 4 18:22:28 2025 +0100

libstdc++: Fix std::format thousands separators when sign present [PR120548]

The leading sign character should be skipped when deciding whether to
insert thousands separators into a floating-point format.
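
The effect of skipping the sign can be sketched with a small standalone
function.  This is not the libstdc++ code: toy_group_digits is a
hypothetical helper, and fixed groups of three digits are an assumption
of the sketch (the real code takes the grouping from the locale).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Insert SEP between groups of three characters in the part of STR
// before the decimal point.  If SKIP_SIGN is true, a leading '-', '+'
// or ' ' is copied through unchanged and excluded from the grouping;
// if false, the sign is (wrongly) counted as if it were a digit.
inline std::string
toy_group_digits (const std::string &str, char sep, bool skip_sign = true)
{
  std::size_t off = 0;
  if (skip_sign && !str.empty ()
      && (str[0] == '-' || str[0] == '+' || str[0] == ' '))
    off = 1;

  std::size_t end = str.find ('.');
  if (end == std::string::npos)
    end = str.size ();

  std::string out = str.substr (0, off);
  const std::size_t ndigits = end - off;
  for (std::size_t i = 0; i < ndigits; ++i)
    {
      // A separator goes before each position that starts a full group.
      if (i != 0 && (ndigits - i) % 3 == 0)
	out += sep;
      out += str[off + i];
    }
  out += str.substr (end);
  return out;
}
```

With skip_sign set to false the sketch reproduces the kind of output
reported in the PR: the sign is treated as a fourth "digit" and a
separator is inserted directly after it.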

libstdc++-v3/ChangeLog:

PR libstdc++/120548
* include/std/format (__formatter_fp::_M_localize): Do not
include a leading sign character in the string to be grouped.
* testsuite/std/format/functions/format.cc: Check grouping when
sign is present in the output.

Reviewed-by: Tomasz Kamiński 
(cherry picked from commit 2c3559839d70df6311da18fd93237050405580c3)

Diff:
---
 libstdc++-v3/include/std/format   | 11 +--
 libstdc++-v3/testsuite/std/format/functions/format.cc | 10 ++
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/include/std/format b/libstdc++-v3/include/std/format
index 8beef93c7809..d25df1224ffc 100644
--- a/libstdc++-v3/include/std/format
+++ b/libstdc++-v3/include/std/format
@@ -2362,9 +2362,16 @@ namespace __format
const size_t __r = __str.size() - __e; // Length of remainder.
auto __overwrite = [&](_CharT* __p, size_t) {
  // Apply grouping to the digits before the radix or exponent.
- auto __end = std::__add_grouping(__p, __np.thousands_sep(),
+ int __off = 0;
+ if (auto __c = __str.front(); __c == '-' || __c == '+' || __c == ' ')
+   {
+ *__p = __c;
+ __off = 1;
+   }
+ auto __end = std::__add_grouping(__p + __off, __np.thousands_sep(),
   __grp.data(), __grp.size(),
-  __str.data(), __str.data() + __e);
+  __str.data() + __off,
+  __str.data() + __e);
  if (__r) // If there's a fractional part or exponent
{
  if (__d != __str.npos)
diff --git a/libstdc++-v3/testsuite/std/format/functions/format.cc b/libstdc++-v3/testsuite/std/format/functions/format.cc
index 93c33b456e64..6f6f1f15b35b 100644
--- a/libstdc++-v3/testsuite/std/format/functions/format.cc
+++ b/libstdc++-v3/testsuite/std/format/functions/format.cc
@@ -260,6 +260,16 @@ test_locale()
   s = std::format(eloc, "{0:Le} {0:Lf} {0:Lg}", -nan);
   VERIFY( s == "-nan -nan -nan" );
 
+  // PR libstdc++/120548 format confuses a negative sign for a thousands digit
+  s = std::format(bloc, "{:L}", -123.45);
+  VERIFY( s == "-123.45" );
+  s = std::format(bloc, "{:-L}", -876543.21);
+  VERIFY( s == "-876,543.21" );
+  s = std::format(bloc, "{:+L}", 333.22);
+  VERIFY( s == "+333.22" );
+  s = std::format(bloc, "{: L}", 999.44);
+  VERIFY( s == " 999.44" );
+
   // Restore
   std::locale::global(cloc);
 }


[gcc r15-9819] libstdc++: Make system_clock::to_time_t always_inline [PR99832]

2025-06-10 Thread Jonathan Wakely via Gcc-cvs
https://gcc.gnu.org/g:5327eef7b003f66b90841af77c5095eebfa53938

commit r15-9819-g5327eef7b003f66b90841af77c5095eebfa53938
Author: Jonathan Wakely 
Date:   Wed May 28 15:19:18 2025 +0100

libstdc++: Make system_clock::to_time_t always_inline [PR99832]

For some 32-bit targets Glibc supports changing the size of time_t to be
64 bits by defining _TIME_BITS=64. That causes an ABI change which
would affect std::chrono::system_clock::to_time_t. Because to_time_t is
not a function template, its mangled name does not depend on the return
type, so it has the same mangled name whether it returns a 32-bit time_t
or a 64-bit time_t. On targets where the size of time_t can be selected
at preprocessing time, that can cause ODR violations, e.g. the linker
selects a definition of to_time_t that returns a 32-bit value but a
caller expects 64-bit and so reads 32 bits of garbage from the stack.

This commit adds always_inline to to_time_t so that all callers inline
the conversion to time_t, and will do so using whatever type time_t
happens to be in that translation unit.

Existing objects compiled before this change will either have inlined
the function anyway (which is likely if compiled with any optimization
enabled) or will contain a COMDAT definition of the inline function and
so still be able to find it at link-time.

The attribute is also added to system_clock::from_time_t, because that's
an equally simple function and it seems reasonable for them to both be
always inlined.
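
A minimal sketch of the pattern, assuming a GNU-compatible compiler that
understands the attribute.  toy_clock is a hypothetical clock type used
only for illustration; it is not std::chrono::system_clock.

```cpp
#include <cassert>
#include <chrono>
#include <ctime>

// Hypothetical clock with always-inlined conversions to and from
// time_t, so each translation unit uses its own definition of time_t.
struct toy_clock
{
  using duration = std::chrono::nanoseconds;
  using time_point = std::chrono::time_point<toy_clock, duration>;

  [[__gnu__::__always_inline__]]
  static std::time_t
  to_time_t (const time_point &t) noexcept
  {
    return std::time_t (std::chrono::duration_cast<std::chrono::seconds>
			  (t.time_since_epoch ()).count ());
  }

  [[__gnu__::__always_inline__]]
  static time_point
  from_time_t (std::time_t t) noexcept
  {
    return time_point (std::chrono::seconds (t));
  }
};
```

Because the body is inlined into every caller, no out-of-line symbol for
the conversion needs to be shared between translation units that
disagree about the size of time_t.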

libstdc++-v3/ChangeLog:

PR libstdc++/99832
* include/bits/chrono.h (system_clock::to_time_t): Add
always_inline attribute to be agnostic to the underlying type of
time_t.
(system_clock::from_time_t): Add always_inline for consistency
with to_time_t.
* testsuite/20_util/system_clock/99832.cc: New test.

(cherry picked from commit d045eb13b0b42870a1f081895df3901112a358f0)

Diff:
---
 libstdc++-v3/include/bits/chrono.h   |  2 ++
 libstdc++-v3/testsuite/20_util/system_clock/99832.cc | 14 ++
 2 files changed, 16 insertions(+)

diff --git a/libstdc++-v3/include/bits/chrono.h b/libstdc++-v3/include/bits/chrono.h
index fad216203d2f..8de8e756c714 100644
--- a/libstdc++-v3/include/bits/chrono.h
+++ b/libstdc++-v3/include/bits/chrono.h
@@ -1244,6 +1244,7 @@ _GLIBCXX_BEGIN_INLINE_ABI_NAMESPACE(_V2)
   now() noexcept;
 
   // Map to C API
+  [[__gnu__::__always_inline__]]
   static std::time_t
   to_time_t(const time_point& __t) noexcept
   {
@@ -1251,6 +1252,7 @@ _GLIBCXX_BEGIN_INLINE_ABI_NAMESPACE(_V2)
   (__t.time_since_epoch()).count());
   }
 
+  [[__gnu__::__always_inline__]]
   static time_point
   from_time_t(std::time_t __t) noexcept
   {
diff --git a/libstdc++-v3/testsuite/20_util/system_clock/99832.cc b/libstdc++-v3/testsuite/20_util/system_clock/99832.cc
new file mode 100644
index ..693d4d647d9b
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/system_clock/99832.cc
@@ -0,0 +1,14 @@
+// { dg-options "-O0 -g0" }
+// { dg-do compile { target c++20 } }
+// { dg-final { scan-assembler-not "system_clock9to_time_t" } }
+
+// Bug libstdc++/99832
+// std::chrono::system_clock::to_time_t needs ABI tag for 32-bit time_t
+
+#include <chrono>
+
+std::time_t
+test_pr99832(std::chrono::system_clock::time_point t)
+{
+  return std::chrono::system_clock::to_time_t(t);
+}


[gcc r13-9750] libstdc++: Fix incorrect links to archived SGI STL docs

2025-06-10 Thread Jonathan Wakely via Gcc-cvs
https://gcc.gnu.org/g:d97de614f7a3729f7a841f48748d0b3bf746d0a2

commit r13-9750-gd97de614f7a3729f7a841f48748d0b3bf746d0a2
Author: Jonathan Wakely 
Date:   Tue May 20 10:53:41 2025 +0100

libstdc++: Fix incorrect links to archived SGI STL docs

In r8--g25949ee33201f2 I updated some URLs to point to copies of the
SGI STL docs in the Wayback Machine, because the original pages were no
longer hosted on sgi.com. However, I incorrectly assumed that if one
archived page was at https://web.archive.org/web/20171225062613/... then
all the other pages would be too. Apparently that's not how the Wayback
Machine works, and each page is archived on a different date. That meant
that some of our links were redirecting to archived copies of the
announcement that the SGI STL docs have gone away.

This fixes each URL to refer to a correctly archived copy of the
original docs.

libstdc++-v3/ChangeLog:

* doc/xml/faq.xml: Update URL for archived SGI STL docs.
* doc/xml/manual/containers.xml: Likewise.
* doc/xml/manual/extensions.xml: Likewise.
* doc/xml/manual/using.xml: Likewise.
* doc/xml/manual/utilities.xml: Likewise.
* doc/html/*: Regenerate.

(cherry picked from commit 501e6e786652748ff0ad9a322f74b9b47970031f)

Diff:
---
 libstdc++-v3/doc/html/faq.html  |  2 +-
 libstdc++-v3/doc/html/manual/containers.html|  2 +-
 libstdc++-v3/doc/html/manual/ext_numerics.html  |  2 +-
 libstdc++-v3/doc/html/manual/ext_sgi.html   |  4 ++--
 libstdc++-v3/doc/html/manual/using_concurrency.html | 10 +-
 libstdc++-v3/doc/html/manual/utilities.html |  4 ++--
 libstdc++-v3/doc/xml/faq.xml|  2 +-
 libstdc++-v3/doc/xml/manual/containers.xml  |  2 +-
 libstdc++-v3/doc/xml/manual/extensions.xml  |  6 +++---
 libstdc++-v3/doc/xml/manual/using.xml   | 10 +-
 libstdc++-v3/doc/xml/manual/utilities.xml   |  4 ++--
 11 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/libstdc++-v3/doc/html/faq.html b/libstdc++-v3/doc/html/faq.html
index 2e913920d05d..d7ad0934cfb9 100644
--- a/libstdc++-v3/doc/html/faq.html
+++ b/libstdc++-v3/doc/html/faq.html
@@ -797,7 +797,7 @@
 Libstdc++-v3 incorporates a lot of code from
 https://web.archive.org/web/20171225062613/http://www.sgi.com/tech/stl/"; 
target="_top">the SGI STL
 (the final merge was from
-https://web.archive.org/web/20171225062613/http://www.sgi.com/tech/stl/whats_new.html";
 target="_top">release 3.3).
+https://web.archive.org/web/20171206110416/http://www.sgi.com/tech/stl/whats_new.html";
 target="_top">release 3.3).
 The code in libstdc++ contains many fixes and changes compared to the
 original SGI code.
 
diff --git a/libstdc++-v3/doc/html/manual/containers.html b/libstdc++-v3/doc/html/manual/containers.html
index 7035a949074d..dcd609a6000d 100644
--- a/libstdc++-v3/doc/html/manual/containers.html
+++ b/libstdc++-v3/doc/html/manual/containers.html
@@ -11,7 +11,7 @@
  Yes it is, at least using the old
  ABI, and that's okay.  This is a decision that we preserved
  when we imported SGI's STL implementation.  The following is
- quoted from https://web.archive.org/web/20171225062613/http://www.sgi.com/tech/stl/FAQ.html";
 target="_top">their FAQ:
+ quoted from https://web.archive.org/web/20161222192301/http://www.sgi.com/tech/stl/FAQ.html";
 target="_top">their FAQ:

The size() member function, for list and slist, takes time
proportional to the number of elements in the list.  This was a
diff --git a/libstdc++-v3/doc/html/manual/ext_numerics.html b/libstdc++-v3/doc/html/manual/ext_numerics.html
index 9b864e1dcf4a..c3a5623d1752 100644
--- a/libstdc++-v3/doc/html/manual/ext_numerics.html
+++ b/libstdc++-v3/doc/html/manual/ext_numerics.html
@@ -14,7 +14,7 @@
The operation functor must be associative.
 The iota function wins the award for 
Extension With the
Coolest Name (the name comes from Ken Iverson's APL language.)  As
-   described in the https://web.archive.org/web/20171225062613/http://www.sgi.com/tech/stl/iota.html";
 target="_top">SGI
+   described in the https://web.archive.org/web/20170201044840/http://www.sgi.com/tech/stl/iota.html";
 target="_top">SGI
documentation, it "assigns sequentially increasing values to a range.
That is, it assigns value to *first,
value + 1 to *(first + 
1) and so on."
diff --git a/libstdc++-v3/doc/html/manual/ext_sgi.html b/libstdc++-v3/doc/html/manual/ext_sgi.html
index ae2062954f4f..2310857804b3 100644
--- a/libstdc++-v3/doc/html/manual/ext_sgi.html
+++ b/libstdc++-v3/doc/html/manual/ext_sgi.html
@@ -28,12 +28,12 @@
   and sets.
Each of the associative containers map, multimap, set, and multiset
   have a counterpart which uses a
-  https://web.archive.org/web/20171225062613/http:

[gcc r14-11834] libstdc++: Fix incorrect links to archived SGI STL docs

2025-06-10 Thread Jonathan Wakely via Libstdc++-cvs
https://gcc.gnu.org/g:e1cbf566970f02a7ac110df58c412be11604b278

commit r14-11834-ge1cbf566970f02a7ac110df58c412be11604b278
Author: Jonathan Wakely 
Date:   Tue May 20 10:53:41 2025 +0100

libstdc++: Fix incorrect links to archived SGI STL docs

In r8--g25949ee33201f2 I updated some URLs to point to copies of the
SGI STL docs in the Wayback Machine, because the original pages were no
longer hosted on sgi.com. However, I incorrectly assumed that if one
archived page was at https://web.archive.org/web/20171225062613/... then
all the other pages would be too. Apparently that's not how the Wayback
Machine works, and each page is archived on a different date. That meant
that some of our links were redirecting to archived copies of the
announcement that the SGI STL docs have gone away.

This fixes each URL to refer to a correctly archived copy of the
original docs.

libstdc++-v3/ChangeLog:

* doc/xml/faq.xml: Update URL for archived SGI STL docs.
* doc/xml/manual/containers.xml: Likewise.
* doc/xml/manual/extensions.xml: Likewise.
* doc/xml/manual/using.xml: Likewise.
* doc/xml/manual/utilities.xml: Likewise.
* doc/html/*: Regenerate.

(cherry picked from commit 501e6e786652748ff0ad9a322f74b9b47970031f)

Diff:
---
 libstdc++-v3/doc/html/faq.html  |  2 +-
 libstdc++-v3/doc/html/manual/containers.html|  2 +-
 libstdc++-v3/doc/html/manual/ext_numerics.html  |  2 +-
 libstdc++-v3/doc/html/manual/ext_sgi.html   |  4 ++--
 libstdc++-v3/doc/html/manual/using_concurrency.html | 10 +-
 libstdc++-v3/doc/html/manual/utilities.html |  4 ++--
 libstdc++-v3/doc/xml/faq.xml|  2 +-
 libstdc++-v3/doc/xml/manual/containers.xml  |  2 +-
 libstdc++-v3/doc/xml/manual/extensions.xml  |  6 +++---
 libstdc++-v3/doc/xml/manual/using.xml   | 10 +-
 libstdc++-v3/doc/xml/manual/utilities.xml   |  4 ++--
 11 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/libstdc++-v3/doc/html/faq.html b/libstdc++-v3/doc/html/faq.html
index bbe716d5e233..622137939335 100644
--- a/libstdc++-v3/doc/html/faq.html
+++ b/libstdc++-v3/doc/html/faq.html
@@ -796,7 +796,7 @@
 Libstdc++-v3 incorporates a lot of code from
 https://web.archive.org/web/20171225062613/http://www.sgi.com/tech/stl/"; 
target="_top">the SGI STL
 (the final merge was from
-https://web.archive.org/web/20171225062613/http://www.sgi.com/tech/stl/whats_new.html";
 target="_top">release 3.3).
+https://web.archive.org/web/20171206110416/http://www.sgi.com/tech/stl/whats_new.html";
 target="_top">release 3.3).
 The code in libstdc++ contains many fixes and changes compared to the
 original SGI code.
 
diff --git a/libstdc++-v3/doc/html/manual/containers.html b/libstdc++-v3/doc/html/manual/containers.html
index 7035a949074d..dcd609a6000d 100644
--- a/libstdc++-v3/doc/html/manual/containers.html
+++ b/libstdc++-v3/doc/html/manual/containers.html
@@ -11,7 +11,7 @@
  Yes it is, at least using the old
  ABI, and that's okay.  This is a decision that we preserved
  when we imported SGI's STL implementation.  The following is
- quoted from https://web.archive.org/web/20171225062613/http://www.sgi.com/tech/stl/FAQ.html";
 target="_top">their FAQ:
+ quoted from https://web.archive.org/web/20161222192301/http://www.sgi.com/tech/stl/FAQ.html";
 target="_top">their FAQ:

The size() member function, for list and slist, takes time
proportional to the number of elements in the list.  This was a
diff --git a/libstdc++-v3/doc/html/manual/ext_numerics.html b/libstdc++-v3/doc/html/manual/ext_numerics.html
index 9b864e1dcf4a..c3a5623d1752 100644
--- a/libstdc++-v3/doc/html/manual/ext_numerics.html
+++ b/libstdc++-v3/doc/html/manual/ext_numerics.html
@@ -14,7 +14,7 @@
The operation functor must be associative.
 The iota function wins the award for 
Extension With the
Coolest Name (the name comes from Ken Iverson's APL language.)  As
-   described in the https://web.archive.org/web/20171225062613/http://www.sgi.com/tech/stl/iota.html";
 target="_top">SGI
+   described in the https://web.archive.org/web/20170201044840/http://www.sgi.com/tech/stl/iota.html";
 target="_top">SGI
documentation, it "assigns sequentially increasing values to a range.
That is, it assigns value to *first,
value + 1 to *(first + 
1) and so on."
diff --git a/libstdc++-v3/doc/html/manual/ext_sgi.html b/libstdc++-v3/doc/html/manual/ext_sgi.html
index ae2062954f4f..2310857804b3 100644
--- a/libstdc++-v3/doc/html/manual/ext_sgi.html
+++ b/libstdc++-v3/doc/html/manual/ext_sgi.html
@@ -28,12 +28,12 @@
   and sets.
Each of the associative containers map, multimap, set, and multiset
   have a counterpart which uses a
-  https://web.archive.org/web/20171225062613/http

[gcc r16-1401] More API for IPA profile manipulation

2025-06-10 Thread Jan Hubicka via Gcc-cvs
https://gcc.gnu.org/g:e416c8097fc87513e05c2d104c63488f733758c0

commit r16-1401-ge416c8097fc87513e05c2d104c63488f733758c0
Author: Jan Hubicka 
Date:   Tue Jun 10 21:32:40 2025 +0200

More API for IPA profile manipulation

This patch attempts to make IPA profile manipulation easier.  It introduces

 node->scale_profile_to (count)
which can be used to scale profile to a given IPA count.
If IPA count is zero, then local profile is preserved and proper variant
of global0 count is used.

 node->make_profile_local
this can be used to drop IPA profile but keep local profile

 node->make_profile_global0
this can be used to make IPA profile 0 but keep local
profile.

Most of this can be accomplished by existing apply_scale.  I.e.
 - node->scale_profile_to (count)
 corresponds to
node->apply_scale (count, node->count),
 - node->make_profile_local
 corresponds to
   node->apply_scale (node->count.guessed_local (), node->count)

I think the new API is clearer about the intention and less error-prone.
It also handles some corner cases where the entry block count of the profile
happens to be 0 but the body is non-zero (because of profile inconsistencies).
In this case the scaling API did more or less random things.

I noticed three bugs in ipa-cp (two already present in released GCCs, and one
of mine introduced by the last patch):

@@ -4528,7 +4528,7 @@ lenient_count_portion_handling (profile_count remainder, cgraph_node *orig_node)
   if (remainder.ipa_p () && !remainder.ipa ().nonzero_p ()
   && orig_node->count.ipa_p () && orig_node->count.ipa ().nonzero_p ()
   && opt_for_fn (orig_node->decl, flag_profile_partial_training))
-remainder = remainder.guessed_local ();
+remainder = orig_node->count.guessed_local ();

The code was intended to drop the IPA profile to local when remainder is 0.
In this case orig_node->count is some non-zero count, but all of the control
flow was redirected to a clone, which means that remainder is 0 (adjusted).
Doing

 remainder = remainder.guessed_local ();

will turn it into 0 (guessed_local), and the scaling will then multiply
all counts by 0 and turn them into guessed local.
We want to keep the original count but reduce the quality, i.e.

 remainder = orig_node->count.guessed_local ();
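
The difference between the two assignments can be seen in a toy model.
This is not GCC's profile_count: toy_count, toy_quality and toy_scale
are hypothetical, and "the scaled count takes the quality of the
numerator" is an assumption of the sketch.

```cpp
#include <cassert>

// Hypothetical quality levels, loosely modelled on the description above.
enum class toy_quality { guessed_local, adjusted, precise };

// Hypothetical count: a value plus the quality of that value.
struct toy_count
{
  long m_val;
  toy_quality m_quality;

  // Keep the value but lower the quality to guessed-local.
  toy_count guessed_local () const
  {
    return {m_val, toy_quality::guessed_local};
  }
};

// Scale C by NUM/DEN; the result takes the quality of NUM.
inline toy_count
toy_scale (toy_count c, toy_count num, toy_count den)
{
  if (den.m_val == 0)
    return {c.m_val, num.m_quality};
  return {c.m_val * num.m_val / den.m_val, num.m_quality};
}
```

With an original node count of 1000 and a remainder of 0, scaling a
block count of 250 by remainder.guessed_local () gives 0 in this model,
whereas scaling by the original count's guessed_local () keeps 250 and
only lowers the quality.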

Second problem is:
   /* TODO: Profile has alreay gone astray, keep what we have but lower it
 to global0 category.  */
   remainder = orig_node->count.global0 ();
global0 means that converting to an IPA count will give a precise 0.  Since
we lost track, it should be an adjusted 0 :)

Finally in:
  new_sum = orig_node_count.combine_with_ipa_count (new_sum);
  orig_node->count = remainder;

  new_node->apply_scale (new_sum, new_node->count);

   if (!orig_edges_processed)
orig_node->apply_scale (remainder, orig_node->count);
orig_node->scale_profile_to (remainder);

orig_node->count is first set to remainder and then the scaling is done
(which in turn does nothing).

This is a bug I introduced in the last patch, which should have removed
orig_node->count = remainder.  As a result, counts of cgraph edges are
currently not adjusted correctly.  I am sorry for that.
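
The first bug can be illustrated with a toy model.  This is only a sketch
under stated assumptions: toy_count, toy_scale and the quality enum below are
hypothetical stand-ins, not GCC's profile_count API, and the scaling rule
(result takes the quality of the numerator) is a simplification.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, heavily simplified stand-in for a profile count:
// a value plus a quality tag.  Not the real GCC class.
enum class quality { guessed_local, adjusted, precise };

struct toy_count
{
  std::uint64_t value;
  quality q;

  // Keep the value, lower the quality to "guessed local".
  constexpr toy_count guessed_local () const
  {
    return { value, quality::guessed_local };
  }
};

// Scale COUNT by NUM/DEN; the result takes the quality of NUM.
// DEN is assumed to be non-zero in this sketch.
constexpr toy_count
toy_scale (toy_count count, toy_count num, toy_count den)
{
  return { count.value * num.value / den.value, num.q };
}

// A node with entry count 1000 whose callers were all redirected to a clone,
// so the remainder is 0 (adjusted).  One basic block has count 250.
constexpr toy_count toy_orig = { 1000, quality::precise };
constexpr toy_count toy_remainder = { 0, quality::adjusted };
constexpr toy_count toy_block = { 250, quality::precise };

// Old behaviour: the scaling target is "0 (guessed local)", so every
// count in the body is multiplied by 0.
static_assert (toy_scale (toy_block, toy_remainder.guessed_local (),
                          toy_orig).value == 0, "profile is wiped out");

// Fixed behaviour: the scaling target keeps the original value and only
// the quality drops.
static_assert (toy_scale (toy_block, toy_orig.guessed_local (),
                          toy_orig).value == 250, "profile shape is kept");
```

In this model, scaling towards remainder.guessed_local () discards the shape
of the profile, while scaling towards orig_node->count.guessed_local () keeps
the values and only lowers their quality.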

gcc/ChangeLog:

* cgraph.cc (cgraph_node::make_profile_local): New member function.
(cgraph_node::make_profile_global0): New member function.
(cgraph_node::apply_scale): Do not call adjust_for_ipa_scalling.
(cgraph_node::scale_profile_to): New member function.
* cgraph.h (cgraph_node::make_profile_local,
cgraph_node::make_profile_global0, cgraph_node::scale_profile_to):
Declare.
* ipa-cp.cc (lenient_count_portion_handling): Fix logic dropping count
to local.
(update_counts_for_self_gen_clones): Use scale_profile_to.
(update_profiling_info): Use make_profile_local, make_profile_global0
and scale_profile_to.
(update_specialized_profile): Likewise.
* ipa-inline-transform.cc (clone_inlined_nodes): Call
adjust_for_ipa_scalling.

Diff:
---
 gcc/cgraph.cc   | 114 +---
 gcc/cgraph.h|  14 +-
 gcc/ipa-cp.cc   |  53 ++--
 gcc/ipa-inline-transform.cc |   5 +-
 4 files changed, 140 insertions(+), 46 deletions(-)

diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc
index 4a037a7bab10..2f31260207df 100644
--- a/gcc/cgraph.cc
+++ b/gcc/cgraph.cc
@@ -179,26 +179,128 @@ cgraph_node::function_version (void)
   return cgraph_fnver_htab->find (&key);
 }
 
-/* Scale profile by NUM/DEN.  Walk into inlined clones.  */
+/* If profile is IPA, turn it into local one.  */
+void
+cgraph_node::make_pr

[gcc/devel/omp/gcc-15] libgomp: Add OpenMP's omp_target_memset/omp_target_memset_async

2025-06-10 Thread Tobias Burnus via Gcc-cvs
https://gcc.gnu.org/g:7704131525574cd28bf3f779da1e1057c46a1f25

commit 7704131525574cd28bf3f779da1e1057c46a1f25
Author: Tobias Burnus 
Date:   Mon Jun 2 17:43:57 2025 +0200

libgomp: Add OpenMP's omp_target_memset/omp_target_memset_async

PR libgomp/120444

include/ChangeLog:

* cuda/cuda.h (cuMemsetD8, cuMemsetD8Async): Declare.

libgomp/ChangeLog:

* libgomp-plugin.h (GOMP_OFFLOAD_memset): Declare.
* libgomp.h (struct gomp_device_descr): Add memset_func.
* libgomp.map (GOMP_6.0.1): Add omp_target_memset{,_async}.
* libgomp.texi (Device Memory Routines): Document them.
* omp.h.in (omp_target_memset, omp_target_memset_async): Declare.
* omp_lib.f90.in (omp_target_memset, omp_target_memset_async):
Add interfaces.
* omp_lib.h.in (omp_target_memset, omp_target_memset_async): Likewise.
* plugin/cuda-lib.def: Add cuMemsetD8.
* plugin/plugin-gcn.c (struct hsa_runtime_fn_info): Add
hsa_amd_memory_fill_fn.
(init_hsa_runtime_functions): DLSYM_OPT_FN load it.
(GOMP_OFFLOAD_memset): New.
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_memset): New.
* target.c (omp_target_memset_int, omp_target_memset,
omp_target_memset_async_helper, omp_target_memset_async): New.
(gomp_load_plugin_for_device): Add DLSYM (memset).
* testsuite/libgomp.c-c++-common/omp_target_memset.c: New test.
* testsuite/libgomp.c-c++-common/omp_target_memset-2.c: New test.
* testsuite/libgomp.c-c++-common/omp_target_memset-3.c: New test.
* testsuite/libgomp.fortran/omp_target_memset.f90: New test.
* testsuite/libgomp.fortran/omp_target_memset-2.f90: New test.

(cherry picked from commit 4e47e2f833732c5d9a3c3e69dc753f99b3a56737)

Diff:
---
 include/cuda/cuda.h|  3 +
 libgomp/libgomp-plugin.h   |  1 +
 libgomp/libgomp.h  |  3 +-
 libgomp/libgomp.map|  6 ++
 libgomp/libgomp.texi   | 98 +-
 libgomp/omp.h.in   |  4 +
 libgomp/omp_lib.f90.in | 23 +
 libgomp/omp_lib.h.in   | 25 ++
 libgomp/plugin/cuda-lib.def|  1 +
 libgomp/plugin/plugin-gcn.c| 80 ++
 libgomp/plugin/plugin-nvptx.c  |  9 ++
 libgomp/target.c   | 83 ++
 .../libgomp.c-c++-common/omp_target_memset-2.c | 62 ++
 .../libgomp.c-c++-common/omp_target_memset-3.c | 80 ++
 .../libgomp.c-c++-common/omp_target_memset.c   | 62 ++
 .../libgomp.fortran/omp_target_memset-2.f90| 67 +++
 .../libgomp.fortran/omp_target_memset.f90  | 39 +
 17 files changed, 642 insertions(+), 4 deletions(-)

diff --git a/include/cuda/cuda.h b/include/cuda/cuda.h
index 5e4b7f190ebf..6be1ac0ab438 100644
--- a/include/cuda/cuda.h
+++ b/include/cuda/cuda.h
@@ -279,6 +279,9 @@ CUresult cuMemcpy3D (const CUDA_MEMCPY3D *);
 CUresult cuMemcpy3DAsync (const CUDA_MEMCPY3D *, CUstream);
 CUresult cuMemcpy3DPeer (const CUDA_MEMCPY3D_PEER *);
 CUresult cuMemcpy3DPeerAsync (const CUDA_MEMCPY3D_PEER *, CUstream);
+#define cuMemsetD8 cuMemsetD8_v2
+CUresult cuMemsetD8 (CUdeviceptr, unsigned char, size_t);
+CUresult cuMemsetD8Async (CUdeviceptr, unsigned char, size_t, CUstream);
 #define cuMemFree cuMemFree_v2
 CUresult cuMemFree (CUdeviceptr);
 CUresult cuMemFreeHost (void *);
diff --git a/libgomp/libgomp-plugin.h b/libgomp/libgomp-plugin.h
index 3c7741bcef88..d0bcc237d7fe 100644
--- a/libgomp/libgomp-plugin.h
+++ b/libgomp/libgomp-plugin.h
@@ -179,6 +179,7 @@ extern int GOMP_OFFLOAD_memcpy3d (int, int, size_t, size_t, size_t, void *,
  size_t, size_t, size_t, size_t, size_t,
  const void *, size_t, size_t, size_t, size_t,
  size_t);
+extern bool GOMP_OFFLOAD_memset (int, void *, int, size_t);
 extern bool GOMP_OFFLOAD_can_run (void *);
 extern void GOMP_OFFLOAD_run (int, void *, void *, void **);
 extern void GOMP_OFFLOAD_async_run (int, void *, void *, void **, void *);
diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index 571ac62ca998..465f7c1b4ea5 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -1441,9 +1441,10 @@ struct gomp_device_descr
   __typeof (GOMP_OFFLOAD_page_locked_host_free) *page_locked_host_free_func;
   __typeof (GOMP_OFFLOAD_dev2host) *dev2host_func;
   __typeof (GOMP_OFFLOAD_host2dev) *host2dev_func;
+  __typeof (GOMP_OFFLOAD_dev2dev) *dev2dev_func;
   __typeof (GOMP_OFFLOAD_memcpy2d) *memcpy2d_func;
   __typeof (GOMP_OFFLOAD_m

[gcc/devel/omp/gcc-15] gcn: Add experimental MI300 (gfx942) support

2025-06-10 Thread Tobias Burnus via Gcc-cvs
https://gcc.gnu.org/g:5e75ec7168fd3ea5b7791ed67f25a29b44967fc3

commit 5e75ec7168fd3ea5b7791ed67f25a29b44967fc3
Author: Tobias Burnus 
Date:   Tue Jun 10 15:12:47 2025 +0200

gcn: Add experimental MI300 (gfx942) support

As gfx942 and gfx950 belong to gfx9-4-generic, the latter two are also added.
Note that there are no MI300-specific optimizations yet.

No multilib is built by default for any of the mentioned devices; use
'--with-multilib-list=' when configuring GCC to build them alongside.
gfx942 was added in LLVM (and its mc assembler, used by GCC) in version 18,
generic support in LLVM 19 and gfx950 in LLVM 20.

gcc/ChangeLog:

* config/gcn/gcn-devices.def: Add gfx942, gfx950 and gfx9-4-generic.
* config/gcn/gcn-opts.h (TARGET_CDNA3, TARGET_CDNA3_PLUS,
TARGET_GLC_NAME, TARGET_TARGET_SC_CACHE): Define.
(TARGET_ARCHITECTED_FLAT_SCRATCH): Use also for CDNA3.
* config/gcn/gcn.h (gcn_isa): Add ISA_CDNA3 to the enum.
* config/gcn/gcn.cc (print_operand): Update 'g' to use
TARGET_GLC_NAME; add 'G' to print TARGET_GLC_NAME unconditionally.
* config/gcn/gcn-valu.md (scatter, gather): Use TARGET_GLC_NAME.
* config/gcn/gcn.md: Use %G instead of glc; use 'buffer_inv sc1'
for TARGET_TARGET_SC_CACHE.
* doc/invoke.texi (march): Add gfx942, gfx950 and gfx9-4-generic.
* doc/install.texi (amdgcn*-*-*): Add gfx942, gfx950 and gfx9-4-generic.
* config/gcn/gcn-tables.opt: Regenerate.

libgomp/ChangeLog:

* testsuite/libgomp.c/declare-variant-4.h (gfx942): New variant function.
* testsuite/libgomp.c/declare-variant-4-gfx942.c: New test.

(cherry picked from commit 37b454b7e171bd8a792cbe4c57ea0f9702afa22d)

Diff:
---
 gcc/config/gcn/gcn-devices.def |  33 
 gcc/config/gcn/gcn-opts.h  |  13 +-
 gcc/config/gcn/gcn-tables.opt  |   9 ++
 gcc/config/gcn/gcn-valu.md |   8 +-
 gcc/config/gcn/gcn.cc  |   8 +-
 gcc/config/gcn/gcn.h   |   2 +
 gcc/config/gcn/gcn.md  | 168 +
 gcc/doc/install.texi   |  17 ++-
 gcc/doc/invoke.texi|  10 ++
 .../testsuite/libgomp.c/declare-variant-4-gfx942.c |   8 +
 libgomp/testsuite/libgomp.c/declare-variant-4.h|   8 +
 11 files changed, 208 insertions(+), 76 deletions(-)

diff --git a/gcc/config/gcn/gcn-devices.def b/gcc/config/gcn/gcn-devices.def
index af1420382e2f..426acf0cb7a5 100644
--- a/gcc/config/gcn/gcn-devices.def
+++ b/gcc/config/gcn/gcn-devices.def
@@ -171,6 +171,28 @@ GCN_DEVICE(gfx90c, GFX90C, 0x32, ISA_GCN5,
   /* Generic Name */ GFX9_GENERIC
   )
 
+GCN_DEVICE(gfx942, GFX942, 0x4c, ISA_CDNA3,
+  /* XNACK default */ HSACO_ATTR_ANY,
+  /* SRAM_ECC default */ HSACO_ATTR_ANY,
+  /* WAVE64 mode */ HSACO_ATTR_UNSUPPORTED,
+  /* CU mode */ HSACO_ATTR_UNSUPPORTED,
+  /* Max ISA VGPRs */ 512,
+  /* Generic code obj version */ 0,  /* non-generic */
+  /* Architecture Family */ GFX9,
+  /* Generic Name */ NONE
+  )
+
+GCN_DEVICE(gfx950, GFX950, 0x4f, ISA_CDNA3,
+  /* XNACK default */ HSACO_ATTR_ANY,
+  /* SRAM_ECC default */ HSACO_ATTR_ANY,
+  /* WAVE64 mode */ HSACO_ATTR_UNSUPPORTED,
+  /* CU mode */ HSACO_ATTR_UNSUPPORTED,
+  /* Max ISA VGPRs */ 512,
+  /* Generic code obj version */ 0,  /* non-generic */
+  /* Architecture Family */ GFX9,
+  /* Generic Name */ NONE
+  )
+
 GCN_DEVICE(gfx9-generic, GFX9_GENERIC, 0x051, ISA_GCN5,
   /* XNACK default */ HSACO_ATTR_ANY,
   /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
@@ -182,6 +204,17 @@ GCN_DEVICE(gfx9-generic, GFX9_GENERIC, 0x051, ISA_GCN5,
   /* Generic Name */ NONE
   )
 
+GCN_DEVICE(gfx9-4-generic, GFX9_4_GENERIC, 0x05f, ISA_CDNA3,
+  /* XNACK default */ HSACO_ATTR_ANY,
+  /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
+  /* WAVE64 mode */ HSACO_ATTR_UNSUPPORTED,
+  /* CU mode */ HSACO_ATTR_UNSUPPORTED,
+  /* Max ISA VGPRs */ 256,
+  /* Generic code obj version */ 1,
+  /* Architecture Family */ GFX9,
+  /* Generic Name */ NONE
+  )
+
 /* GCN GFX10.3 (RDNA 2) */
 
 GCN_DEVICE(gfx1030, GFX1030, 0x36, ISA_RDNA2,
diff --git a/gcc/config/gcn/gcn-opts.h b/gcc/config/gcn/gcn-opts.h
index 88f562dfc1e1..bcea14f3fe7a 100644
--- a/gcc/config/gcn/gcn-opts.h
+++ b/gcc/config/gcn/gcn-opts.h
@@ -33,7 +33,8 @@ extern enum gcn_isa {
   ISA_RDNA2,
   ISA_RDNA3,
   ISA_CDNA1,
-  ISA_CDNA2
+  ISA_CDNA2,
+  ISA_CDNA3
 } gcn_isa;
 
 #define TARGET_GCN5 (gcn_isa == ISA_GCN5)
@@ -41,6 +42,8 @@ exter

[gcc/devel/omp/gcc-15] Merge branch 'releases/gcc-15' into devel/omp/gcc-15

2025-06-10 Thread Tobias Burnus via Gcc-cvs
https://gcc.gnu.org/g:682e7678f3d2b5b974bf564deea7a405f0fd37bf

commit 682e7678f3d2b5b974bf564deea7a405f0fd37bf
Merge: f34abf47bf57 5327eef7b003
Author: Tobias Burnus 
Date:   Tue Jun 10 21:56:49 2025 +0200

Merge branch 'releases/gcc-15' into devel/omp/gcc-15

Merge up to r15-9819-g5327eef7b003f6 (June 10, 2025)

Diff:

 gcc/ChangeLog  |  95 +++
 gcc/DATESTAMP  |   2 +-
 gcc/ada/ChangeLog  | 112 +++
 gcc/ada/checks.adb |  15 +-
 gcc/ada/contracts.adb  | 103 +--
 gcc/ada/einfo.ads  |   2 +-
 gcc/ada/exp_aggr.adb   |  16 +-
 gcc/ada/exp_attr.adb   |   9 +-
 gcc/ada/exp_ch4.adb|  62 +-
 gcc/ada/exp_ch5.adb|  24 +-
 gcc/ada/exp_util.adb   | 148 +++-
 gcc/ada/exp_util.ads   |  18 +-
 gcc/ada/freeze.adb |  11 +-
 gcc/ada/libgnarl/s-stusta.adb  |   5 +-
 gcc/ada/sem_case.adb   |   8 +-
 gcc/ada/sem_ch10.adb   |   2 +
 gcc/ada/sem_ch12.adb   |  15 +-
 gcc/ada/sem_ch3.adb|   4 +-
 gcc/ada/sem_ch4.adb| 911 +++--
 gcc/ada/sem_prag.adb   |   9 +-
 gcc/cp/ChangeLog   |  16 +
 gcc/cp/constexpr.cc|   3 +-
 gcc/cp/cp-gimplify.cc  |  21 +-
 gcc/cp/decl2.cc|  33 +-
 gcc/ext-dce.cc |  17 +-
 gcc/testsuite/ChangeLog|  75 ++
 gcc/testsuite/g++.dg/cpp1z/constexpr-if39.C|  30 +
 gcc/testsuite/g++.dg/cpp2a/constexpr-prvalue2.C|  26 +
 gcc/tree-vectorizer.h  |   1 +
 libstdc++-v3/ChangeLog |  19 +
 libstdc++-v3/include/bits/chrono.h |   2 +
 libstdc++-v3/include/std/format|  11 +-
 .../testsuite/20_util/system_clock/99832.cc|  14 +
 .../testsuite/std/format/functions/format.cc   |  10 +
 34 files changed, 1414 insertions(+), 435 deletions(-)


[gcc/devel/omp/gcc-15] ChangeLog.omp bump

2025-06-10 Thread Tobias Burnus via Gcc-cvs
https://gcc.gnu.org/g:a6a5a2674c5c7f2ae64277c7f79a3b8c20a87fc6

commit a6a5a2674c5c7f2ae64277c7f79a3b8c20a87fc6
Author: Tobias Burnus 
Date:   Tue Jun 10 21:57:52 2025 +0200

ChangeLog.omp bump

Diff:
---
 gcc/ChangeLog.omp | 19 +++
 gcc/DATESTAMP.omp |  2 +-
 include/ChangeLog.omp |  8 
 libgomp/ChangeLog.omp | 37 +
 4 files changed, 65 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.omp b/gcc/ChangeLog.omp
index 6ac795bf4c33..9934978ef5b4 100644
--- a/gcc/ChangeLog.omp
+++ b/gcc/ChangeLog.omp
@@ -1,3 +1,22 @@
+2025-06-10  Tobias Burnus  
+
+   Backported from master:
+   2025-06-10  Tobias Burnus  
+
+   * config/gcn/gcn-devices.def: Add gfx942, gfx950 and gfx9-4-generic.
+   * config/gcn/gcn-opts.h (TARGET_CDNA3, TARGET_CDNA3_PLUS,
+   TARGET_GLC_NAME, TARGET_TARGET_SC_CACHE): Define.
+   (TARGET_ARCHITECTED_FLAT_SCRATCH): Use also for CDNA3.
+   * config/gcn/gcn.h (gcn_isa): Add ISA_CDNA3 to the enum.
+   * config/gcn/gcn.cc (print_operand): Update 'g' to use
+   TARGET_GLC_NAME; add 'G' to print TARGET_GLC_NAME unconditionally.
+   * config/gcn/gcn-valu.md (scatter, gather): Use TARGET_GLC_NAME.
+   * config/gcn/gcn.md: Use %G instead of glc; use 'buffer_inv sc1'
+   for TARGET_TARGET_SC_CACHE.
+   * doc/invoke.texi (march): Add gfx942, gfx950 and gfx9-4-generic.
+   * doc/install.texi (amdgcn*-*-*): Add gfx942, gfx950 and gfx9-4-generic.
+   * config/gcn/gcn-tables.opt: Regenerate.
+
 2025-06-06  Tobias Burnus  
 
Backported from master:
diff --git a/gcc/DATESTAMP.omp b/gcc/DATESTAMP.omp
index c6de4e349988..52988ae3b03d 100644
--- a/gcc/DATESTAMP.omp
+++ b/gcc/DATESTAMP.omp
@@ -1 +1 @@
-20250606
+20250610
diff --git a/include/ChangeLog.omp b/include/ChangeLog.omp
index 74413c262e62..7a8f2810132f 100644
--- a/include/ChangeLog.omp
+++ b/include/ChangeLog.omp
@@ -1,3 +1,11 @@
+2025-06-10  Tobias Burnus  
+
+   Backported from master:
+   2025-06-02  Tobias Burnus  
+
+   PR libgomp/120444
+   * cuda/cuda.h (cuMemsetD8, cuMemsetD8Async): Declare.
+
 2025-05-15  Julian Brown  
 
* gomp-constants.h (gomp_map_kind): Add GOMP_MAP_TO_GRID,
diff --git a/libgomp/ChangeLog.omp b/libgomp/ChangeLog.omp
index e25761590956..2bf31a9d2180 100644
--- a/libgomp/ChangeLog.omp
+++ b/libgomp/ChangeLog.omp
@@ -1,3 +1,40 @@
+2025-06-10  Tobias Burnus  
+
+   Backported from master:
+   2025-06-10  Tobias Burnus  
+
+   * testsuite/libgomp.c/declare-variant-4.h (gfx942): New variant function.
+   * testsuite/libgomp.c/declare-variant-4-gfx942.c: New test.
+
+2025-06-10  Tobias Burnus  
+
+   Backported from master:
+   2025-06-02  Tobias Burnus  
+
+   PR libgomp/120444
+   * libgomp-plugin.h (GOMP_OFFLOAD_memset): Declare.
+   * libgomp.h (struct gomp_device_descr): Add memset_func.
+   * libgomp.map (GOMP_6.0.1): Add omp_target_memset{,_async}.
+   * libgomp.texi (Device Memory Routines): Document them.
+   * omp.h.in (omp_target_memset, omp_target_memset_async): Declare.
+   * omp_lib.f90.in (omp_target_memset, omp_target_memset_async):
+   Add interfaces.
+   * omp_lib.h.in (omp_target_memset, omp_target_memset_async): Likewise.
+   * plugin/cuda-lib.def: Add cuMemsetD8.
+   * plugin/plugin-gcn.c (struct hsa_runtime_fn_info): Add
+   hsa_amd_memory_fill_fn.
+   (init_hsa_runtime_functions): DLSYM_OPT_FN load it.
+   (GOMP_OFFLOAD_memset): New.
+   * plugin/plugin-nvptx.c (GOMP_OFFLOAD_memset): New.
+   * target.c (omp_target_memset_int, omp_target_memset,
+   omp_target_memset_async_helper, omp_target_memset_async): New.
+   (gomp_load_plugin_for_device): Add DLSYM (memset).
+   * testsuite/libgomp.c-c++-common/omp_target_memset.c: New test.
+   * testsuite/libgomp.c-c++-common/omp_target_memset-2.c: New test.
+   * testsuite/libgomp.c-c++-common/omp_target_memset-3.c: New test.
+   * testsuite/libgomp.fortran/omp_target_memset.f90: New test.
+   * testsuite/libgomp.fortran/omp_target_memset-2.f90: New test.
+
 2025-06-06  Tobias Burnus  
 
Backported from master:


[gcc/devel/omp/gcc-15] (33 commits) ChangeLog.omp bump

2025-06-10 Thread Tobias Burnus via Gcc-cvs
The branch 'devel/omp/gcc-15' was updated to point to:

 a6a5a2674c5c... ChangeLog.omp bump

It previously pointed to:

 f34abf47bf57... ChangeLog.omp bump

Diff:

Summary of changes (added commits):
---

  a6a5a26... ChangeLog.omp bump
  5e75ec7... gcn: Add experimental MI300 (gfx942) support
  7704131... libgomp: Add OpenMP's omp_target_memset/omp_target_memset_a
  682e767... Merge branch 'releases/gcc-15' into devel/omp/gcc-15
  5327eef... libstdc++: Make system_clock::to_time_t always_inline [PR99 (*)
  974d59a... libstdc++: Fix std::format thousands separators when sign p (*)
  615a92a... vectorizer: Fix riscv build [PR120042] (*)
  a35f642... ada: Error on subtype with static predicate used in case_ex (*)
  e249cec... ada: Fix fallout of latest change (*)
  d02a2fe... ada: Fix wrong initialization of library-level object by co (*)
  4aca5bc... ada: Storage_Error on Ordered_Maps container aggregate with (*)
  8a4b72a... ada: Fix infinite loop with aggregate in generic unit (*)
  2859883... ada: Fix use-after-free in Compute_All_Tasks (*)
  ba729e2... ext-dce: Don't refine live width with SUBREG mode if !TRULY (*)
  62724ea... Daily bump. (*)
  e4940c0... c++: recursive template with deduced return [PR120555] (*)
  4e4684c... c++: constexpr prvalues vs genericize [PR120502] (*)
  d96603a... ada: Support fixed-lower-bound array types as generic actua (*)
  6cc5c01... ada: Reject component-related aspects on formal non-array t (*)
  2fd267b... ada: Fix glitch in handling of Atomic_Components on generic (*)
  f59c4d4... ada: Missing discriminant check on assignment of Bounded_Ve (*)
  e68026c... ada: Check validity using signedness from the type and not  (*)
  8a63f6b... ada: Incorrect creation of corresponding expression of clas (*)
  823e973... ada: Fix spurious error on anonymous array initialized by c (*)
  c8934b1... Daily bump. (*)
  4caedcd... Daily bump. (*)
  69eb171... Daily bump. (*)
  f59d33a... ada: Constant_Indexing used when context requires a variabl (*)
  e0777e7... ada: Fix libgpr2 build failure with compiler built with ass (*)
  cb3e765... ada: Fix wrong initialization of library-level object by co (*)
  1189522... ada: Incorrect unresolved operator name in an instantiation (*)
  855fe36... ada: Fix internal error on allocator involving interface ty (*)
  649bde8... ada: Fix for validity checking of limited scalar types (*)

(*) This commit already exists in another branch.
Because the reference `refs/heads/devel/omp/gcc-15' matches
your hooks.email-new-commits-only configuration,
no separate email is sent for this commit.


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:141ce9e514fba385685c8c46587719fc2ddf464e

commit 141ce9e514fba385685c8c46587719fc2ddf464e
Author: Michael Meissner 
Date:   Tue Jun 10 16:05:20 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector/vector nor/and fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md| 15 +--
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index 161419b7f586..ed15fccdf760 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -1949,20 +1949,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vnor -> vand
 (define_insn "*fuse_vnor_vand"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(and:VM (and:VM (not:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v"))
-  (not:VM (match_operand:VM 1 "altivec_register_operand" "v,v,v,v")))
- (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(and:VM (and:VM (not:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v"))
+  (not:VM (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v")))
+ (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v")))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vnor %3,%1,%0\;vand %3,%3,%2
vnor %3,%1,%0\;vand %3,%3,%2
vnor %3,%1,%0\;vand %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,8
vnor %4,%1,%0\;vand %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vor -> vand
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 9d3a01a4704a..40d62ae8e9c1 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -219,6 +219,7 @@ sub gen_logical_addsubf
   "vandc_vand"  =>   2,
   "vxor_vand"   =>   6,
   "vor_vand"=>   7,
+  "vnor_vand"   =>   8,
 );
 
 KIND: foreach $kind ('scalar','vector') {
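
The immediates in %xxeval_fusions can be cross-checked against the RTL
patterns with a small model.  This is a sketch, assuming that the immediate is
an 8-entry truth table indexed by 4*A + 2*B + C with index 0 held in the most
significant bit, and that "xxeval %x3,%x2,%x1,%x0,N" passes operand 2 as A,
operand 1 as B and operand 0 as C.  eval3 is a hypothetical helper name, not
part of GCC.

```cpp
#include <cassert>
#include <cstdint>

// Bitwise model of a three-input truth-table operation.  For every bit
// position the inputs (a, b, c) form the index 4*a + 2*b + c; the result
// bit is the IMM bit at that index, with index 0 being the most
// significant of the 8 immediate bits.
constexpr std::uint32_t
eval3 (std::uint32_t a, std::uint32_t b, std::uint32_t c, unsigned imm)
{
  std::uint32_t result = 0;
  for (unsigned idx = 0; idx < 8; ++idx)
    {
      if (((imm >> (7 - idx)) & 1u) == 0)
        continue;
      // Select the bit positions whose inputs match this table index.
      std::uint32_t ma = (idx & 4) ? a : ~a;
      std::uint32_t mb = (idx & 2) ? b : ~b;
      std::uint32_t mc = (idx & 1) ? c : ~c;
      result |= ma & mb & mc;
    }
  return result;
}

// Arbitrary test inputs covering all eight (a, b, c) combinations.
constexpr std::uint32_t A = 0xF0F0F0F0u, B = 0xCCCCCCCCu, C = 0xAAAAAAAAu;

static_assert (eval3 (A, B, C, 1) == ((C & B) & A), "vand_vand");
static_assert (eval3 (A, B, C, 6) == ((C ^ B) & A), "vxor_vand");
static_assert (eval3 (A, B, C, 7) == ((C | B) & A), "vor_vand");
static_assert (eval3 (A, B, C, 8) == ((~C & ~B) & A), "vnor_vand");
```

Under these assumptions the values 1, 6, 7 and 8 match the and/and, xor/and,
or/and and nor/and patterns in fusion.md.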


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:6ee9c4088f72ad5a26d56e8e405ddfcea3aba769

commit 6ee9c4088f72ad5a26d56e8e405ddfcea3aba769
Author: Michael Meissner 
Date:   Tue Jun 10 16:02:26 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector/vector or/and fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md| 15 +--
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index 6375cd3a8970..161419b7f586 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -1967,20 +1967,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vor -> vand
 (define_insn "*fuse_vor_vand"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(and:VM (ior:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v")
-  (match_operand:VM 1 "altivec_register_operand" "v,v,v,v"))
- (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(and:VM (ior:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v")
+  (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v"))
+ (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v")))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vor %3,%1,%0\;vand %3,%3,%2
vor %3,%1,%0\;vand %3,%3,%2
vor %3,%1,%0\;vand %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,7
vor %4,%1,%0\;vand %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vorc -> vand
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 2c631b944587..9d3a01a4704a 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -218,6 +218,7 @@ sub gen_logical_addsubf
   "vand_vand"   =>   1,
   "vandc_vand"  =>   2,
   "vxor_vand"   =>   6,
+  "vor_vand"=>   7,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:fd2acd0a00a7ce79e474e5cc55d071601c54d2db

commit fd2acd0a00a7ce79e474e5cc55d071601c54d2db
Author: Michael Meissner 
Date:   Tue Jun 10 15:59:17 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector/vector xor/and fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md| 15 +--
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index b9590b6d1104..6375cd3a8970 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -2003,20 +2003,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vxor -> vand
 (define_insn "*fuse_vxor_vand"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(and:VM (xor:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v")
-  (match_operand:VM 1 "altivec_register_operand" "v,v,v,v"))
- (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(and:VM (xor:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v")
+  (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v"))
+ (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v")))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vxor %3,%1,%0\;vand %3,%3,%2
vxor %3,%1,%0\;vand %3,%3,%2
vxor %3,%1,%0\;vand %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,6
vxor %4,%1,%0\;vand %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vand -> vandc
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 23adf98c4056..2c631b944587 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -217,6 +217,7 @@ sub gen_logical_addsubf
 my %xxeval_fusions = (
   "vand_vand"   =>   1,
   "vandc_vand"  =>   2,
+  "vxor_vand"   =>   6,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:742340663eeb49c659420603e1d5a73579316004

commit 742340663eeb49c659420603e1d5a73579316004
Author: Michael Meissner 
Date:   Tue Jun 10 15:45:21 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector/vector and/and fusion if XXEVAL is supported.
* config/rs6000/predicates.md (vector_fusion_operand): New predicate.
* config/rs6000/rs6000.h (TARGET_XXEVAL): New macro.
* config/rs6000/rs6000.md (isa attribute): Add xxeval.
(enabled attribute): Add support for XXEVAL support.

Diff:
---
 gcc/config/rs6000/fusion.md | 15 ++-
 gcc/config/rs6000/genfusion.pl  | 58 ++---
 gcc/config/rs6000/predicates.md | 12 +
 gcc/config/rs6000/rs6000.h  |  4 +++
 gcc/config/rs6000/rs6000.md |  7 -
 5 files changed, 85 insertions(+), 11 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index 621b346f9eb9..d24837d68d83 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -1871,20 +1871,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vand -> vand
 (define_insn "*fuse_vand_vand"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(and:VM (and:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v")
-  (match_operand:VM 1 "altivec_register_operand" "%v,v,v,v"))
- (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(and:VM (and:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v")
+  (match_operand:VM 1 "vector_fusion_operand" "%v,v,v,wa,v"))
+ (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v")))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vand %3,%1,%0\;vand %3,%3,%2
vand %3,%1,%0\;vand %3,%3,%2
vand %3,%1,%0\;vand %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,1
vand %4,%1,%0\;vand %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vandc -> vand
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index e5d3b1ee449d..351a4d914a4a 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -211,25 +211,33 @@ sub gen_logical_addsubf
$inner_comp, $inner_inv, $inner_rtl, $inner_op, $both_commute, $c4,
$bc, $inner_arg0, $inner_arg1, $inner_exp, $outer_arg2, $outer_exp,
$ftype, $insn, $is_subf, $is_rsubf, $outer_32, $outer_42,$outer_name,
-   $fuse_type);
-  KIND: foreach $kind ('scalar','vector') {
+   $fuse_type, $xxeval, $c5, $vect_pred, $vect_inner_arg0, $vect_inner_arg1,
+   $vect_inner_exp, $vect_outer_arg2, $vect_outer_exp);
+
+my %xxeval_fusions = (
+  "vand_vand"   =>   1,
+);
+
+KIND: foreach $kind ('scalar','vector') {
   @outer_ops = @logicals;
   if ( $kind eq 'vector' ) {
  $vchr = "v";
  $mode = "VM";
  $pred = "altivec_register_operand";
+ $vect_pred = "vector_fusion_operand";
  $constraint = "v";
  $fuse_type = "fused_vector";
   } else {
  $vchr = "";
  $mode = "GPR";
- $pred = "gpc_reg_operand";
+ $vect_pred = $pred = "gpc_reg_operand";
  $constraint = "r";
  $fuse_type = "fused_arith_logical";
  push (@outer_ops, @addsub);
  push (@outer_ops, ( "rsubf" ));
   }
   $c4 = "${constraint},${constraint},${constraint},${constraint}";
+  $c5 = "${constraint},${constraint},${constraint},wa,${constraint}";
 OUTER: foreach $outer ( @outer_ops ) {
$outer_name = "${vchr}${outer}";
$is_subf = ( $outer eq "subf" );
@@ -263,23 +271,33 @@ sub gen_logical_addsubf
  $bc = ""; if ( $both_commute ) { $bc = "%"; }
  $inner_arg0 = "(match_operand:${mode} 0 \"${pred}\" \"${c4}\")";
  $inner_arg1 = "(match_operand:${mode} 1 \"${pred}\" \"${bc}${c4}\")";
+ $vect_inner_arg0 = "(match_operand:${mode} 0 \"${vect_pred}\" \"${c5}\")";
+ $vect_inner_arg1 = "(match_operand:${mode} 1 \"${vect_pred}\" \"${bc}${c5}\")";
  if ( ($inner_comp & 1) == 1 ) {
  $inner_arg0 = "(not:${mode} $inner_arg0)";
+ $vect_inner_arg0 = "(not:${mode} $vect_inner_arg0)";
  }
  if ( ($inn

[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:2f541cae69f67d9e61b6bafa6f4847fa27fdaf05

commit 2f541cae69f67d9e61b6bafa6f4847fa27fdaf05
Author: Michael Meissner 
Date:   Tue Jun 10 18:24:33 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/testsuite/

PR target/117251
* gcc.target/powerpc/p10-vector-fused-1.c: New test.
* gcc.target/powerpc/p10-vector-fused-2.c: Likewise.

Diff:
---
 .../gcc.target/powerpc/p10-vector-fused-1.c| 409 +
 .../gcc.target/powerpc/p10-vector-fused-2.c| 936 +
 2 files changed, 1345 insertions(+)

diff --git a/gcc/testsuite/gcc.target/powerpc/p10-vector-fused-1.c b/gcc/testsuite/gcc.target/powerpc/p10-vector-fused-1.c
new file mode 100644
index ..28e0874b3454
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/p10-vector-fused-1.c
@@ -0,0 +1,409 @@
+/* { dg-do run } */
+/* { dg-require-effective-target power10_hw } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+/* Generate and check most of the vector logical instruction combinations that
+   may or may not generate xxeval to do a fused operation on power10.  */
+
+#include 
+#include 
+#include 
+
+#ifdef DEBUG
+#include 
+
+static int errors = 0;
+static int tests  = 0;
+#endif
+
+typedef vector unsigned int vector_t;
+typedef unsigned int   scalar_t;
+
+/* Vector logical functions.  */
+static inline vector_t
+vector_and (vector_t x, vector_t y)
+{
+  return x & y;
+}
+
+static inline vector_t
+vector_or (vector_t x, vector_t y)
+{
+  return x | y;
+}
+
+static inline vector_t
+vector_xor (vector_t x, vector_t y)
+{
+  return x ^ y;
+}
+
+static inline vector_t
+vector_andc (vector_t x, vector_t y)
+{
+  return x & ~y;
+}
+
+static inline vector_t
+vector_orc (vector_t x, vector_t y)
+{
+  return x | ~y;
+}
+
+static inline vector_t
+vector_nand (vector_t x, vector_t y)
+{
+  return ~(x & y);
+}
+
+static inline vector_t
+vector_nor (vector_t x, vector_t y)
+{
+  return ~(x | y);
+}
+
+static inline vector_t
+vector_eqv (vector_t x, vector_t y)
+{
+  return ~(x ^ y);
+}
+
+/* Scalar logical functions.  */
+static inline scalar_t
+scalar_and (scalar_t x, scalar_t y)
+{
+  return x & y;
+}
+
+static inline scalar_t
+scalar_or (scalar_t x, scalar_t y)
+{
+  return x | y;
+}
+
+static inline scalar_t
+scalar_xor (scalar_t x, scalar_t y)
+{
+  return x ^ y;
+}
+
+static inline scalar_t
+scalar_andc (scalar_t x, scalar_t y)
+{
+  return x & ~y;
+}
+
+static inline scalar_t
+scalar_orc (scalar_t x, scalar_t y)
+{
+  return x | ~y;
+}
+
+static inline scalar_t
+scalar_nand (scalar_t x, scalar_t y)
+{
+  return ~(x & y);
+}
+
+static inline scalar_t
+scalar_nor (scalar_t x, scalar_t y)
+{
+  return ~(x | y);
+}
+
+static inline scalar_t
+scalar_eqv (scalar_t x, scalar_t y)
+{
+  return ~(x ^ y);
+}
+
+
+/*
+ * Generate one function for each combination that we are checking.  Do 4
+ * operations:
+ *
+ * Use FPR regs that should generate either XXEVAL or XXL* insns;
+ * Use Altivec registers that may generate fused V* insns;
+ * Use VSX registers, ensuring fusing is not done via asm; (and)
+ * Use GPR registers on scalar operations.
+ */
+
+#ifdef DEBUG
+#define TRACE(INNER, OUTER)\
+  do { \
+tests++;   \
+printf ("%s_%s\n", INNER, OUTER);  \
+fflush (stdout);   \
+  } while (0)  \
+
+#define FAILED(INNER, OUTER)   \
+  do { \
+errors++;  \
+printf ("%s_%s failed\n", INNER, OUTER);   \
+fflush (stdout);   \
+  } while (0)  \
+
+#else
+#define TRACE(INNER, OUTER)
+#define FAILED(INNER, OUTER)   abort ()
+#endif
+
+#define FUSED_FUNC(INNER, OUTER)   \
+static void\
+INNER ## _ ## OUTER (vector_t a, vector_t b, vector_t c)   \
+{  \
+  vector_t f_a, f_b, f_c, f_r, f_t;\
+  vector_t v_a, v_b, v_c, v_r, v_t;\
+  vector_t w_a, w_b, w_c, w_r, w_t;\
+  scalar_t s_a, s_b, s_c, s_r, s_t;\
+   \
+  TRACE (#INNER, #OUTER);  \
+  

[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:a5bdfca7a952be8e1dd86f5e5a026197bf08fd98

commit a5bdfca7a952be8e1dd86f5e5a026197bf08fd98
Author: Michael Meissner 
Date:   Tue Jun 10 16:34:00 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector andc => xor fusion if XXEVAL is supported.
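As an illustration of where the immediate 45 in the new table entry comes from, here is a minimal sketch in C. It assumes the xxeval immediate is an 8-bit truth table whose entry for inputs (a, b, c) sits at bit index 4*a + 2*b + c counted from the most significant bit, and that the new alternative passes a = %x2, b = %x1, c = %x0. The names bool3_fn, xxeval_imm and andc_xor are hypothetical helpers, not part of the patch.

```c
#include <assert.h>

/* Hypothetical helper type: a 3-input boolean function on single bits.  */
typedef unsigned (*bool3_fn) (unsigned a, unsigned b, unsigned c);

/* Build the 8-bit immediate for F.  Assumption: the truth-table entry for
   inputs (a, b, c) is stored at bit index 4*a + 2*b + c, counted from the
   most significant bit of the immediate.  */
static unsigned
xxeval_imm (bool3_fn f)
{
  unsigned imm = 0;

  for (unsigned idx = 0; idx < 8; idx++)
    {
      unsigned a = (idx >> 2) & 1;
      unsigned b = (idx >> 1) & 1;
      unsigned c = idx & 1;
      unsigned bit = f (a, b, c);

      assert (bit <= 1);
      imm |= bit << (7 - idx);
    }

  return imm;
}

/* andc => xor on single bits, with a = %x2, b = %x1, c = %x0:
   (b & ~c) ^ a.  */
static unsigned
andc_xor (unsigned a, unsigned b, unsigned c)
{
  return ((b & (~c & 1)) ^ a) & 1;
}
```

Under those assumptions, xxeval_imm (andc_xor) evaluates to 45, which agrees with the "vandc_vxor" entry added below.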

Diff:
---
 gcc/config/rs6000/fusion.md| 15 +--
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index fccea39d0aae..6e5c88b81b44 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -2933,20 +2933,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vandc -> vxor
 (define_insn "*fuse_vandc_vxor"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(xor:VM (and:VM (not:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v"))
-  (match_operand:VM 1 "altivec_register_operand" "v,v,v,v"))
- (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(xor:VM (and:VM (not:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v"))
+  (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v"))
+ (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v")))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vandc %3,%1,%0\;vxor %3,%3,%2
vandc %3,%1,%0\;vxor %3,%3,%2
vandc %3,%1,%0\;vxor %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,45
vandc %4,%1,%0\;vxor %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector veqv -> vxor
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index ab714b10f622..d15208a4ad3e 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -227,6 +227,7 @@ sub gen_logical_addsubf
   "vnand_vnor"  =>  16,
   "vand_vxor"   =>  30,
   "vand_vor"=>  31,
+  "vandc_vxor"  =>  45,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:a033ea0c0d3822b7b2f43b244d17ae3c04173e61

commit a033ea0c0d3822b7b2f43b244d17ae3c04173e61
Author: Michael Meissner 
Date:   Tue Jun 10 18:19:39 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector and => nand fusion if XXEVAL is supported.
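To check an immediate such as 254 without Power10 hardware, a small software model can help. The sketch below is an assumption-based model of xxeval on one 32-bit lane, not the ISA definition: it treats the immediate as a truth table indexed by 4*a + 2*b + c from the most significant bit, and ORs together the minterms that the immediate selects. The name xxeval_model is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Software model of xxeval on a single 32-bit lane.  Assumption: bit
   (7 - idx) of IMM gives the result for the input pattern
   idx = 4*a + 2*b + c, applied independently to every bit position.  */
static uint32_t
xxeval_model (uint32_t a, uint32_t b, uint32_t c, unsigned imm)
{
  uint32_t result = 0;

  assert (imm <= 255);

  for (unsigned idx = 0; idx < 8; idx++)
    if ((imm >> (7 - idx)) & 1)
      {
        /* Minterm: the bit positions where (a, b, c) equals pattern idx.  */
        uint32_t ma = (idx & 4) ? a : ~a;
        uint32_t mb = (idx & 2) ? b : ~b;
        uint32_t mc = (idx & 1) ? c : ~c;

        result |= ma & mb & mc;
      }

  return result;
}
```

With immediate 254 every row except a = b = c = 1 is selected, so the model returns ~(a & b & c), the same value the vand/vnand pair computes.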

Diff:
---
 gcc/config/rs6000/fusion.md| 15 +--
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index 129f7dfb26ed..61d66129da65 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -2336,20 +2336,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vand -> vnand
 (define_insn "*fuse_vand_vnand"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(ior:VM (not:VM (and:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v")
-  (match_operand:VM 1 "altivec_register_operand" "v,v,v,v")))
- (not:VM (match_operand:VM 2 "altivec_register_operand" "v,v,v,v"))))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(ior:VM (not:VM (and:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v")
+  (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v")))
+ (not:VM (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v"))))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vand %3,%1,%0\;vnand %3,%3,%2
vand %3,%1,%0\;vnand %3,%3,%2
vand %3,%1,%0\;vnand %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,254
vand %4,%1,%0\;vnand %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vandc -> vnand
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 1d31c242042e..9261dd369340 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -258,6 +258,7 @@ sub gen_logical_addsubf
   "vor_vnand"   => 248,
   "vxor_vnand"  => 249,
   "vandc_vnand" => 253,
+  "vand_vnand"  => 254,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:303ab25c03790bd8cc2be68240371e41e5e75c7f

commit 303ab25c03790bd8cc2be68240371e41e5e75c7f
Author: Michael Meissner 
Date:   Tue Jun 10 17:36:30 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector orc => or fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md| 15 +--
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index 3d7e6502b027..f6dc26e9c1f2 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -2762,20 +2762,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vorc -> vor
 (define_insn "*fuse_vorc_vor"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(ior:VM (ior:VM (not:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v"))
-  (match_operand:VM 1 "altivec_register_operand" "v,v,v,v"))
- (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(ior:VM (ior:VM (not:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v"))
+  (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v"))
+ (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v")))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vorc %3,%1,%0\;vor %3,%3,%2
vorc %3,%1,%0\;vor %3,%3,%2
vorc %3,%1,%0\;vor %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,191
vorc %4,%1,%0\;vor %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vxor -> vor
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 15f931baad33..62f2b9e36d89 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -245,6 +245,7 @@ sub gen_logical_addsubf
   "veqv_vxor"   => 150,
   "veqv_vor"=> 159,
   "vorc_vxor"   => 180,
+  "vorc_vor"=> 191,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:8228e5b4ab9a58e16b692a4f4ca4fa83c688c222

commit 8228e5b4ab9a58e16b692a4f4ca4fa83c688c222
Author: Michael Meissner 
Date:   Tue Jun 10 17:45:02 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector andc => eqv fusion if XXEVAL is supported.
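The value 210 is the bitwise complement of 45, the immediate used for andc => xor, since eqv is the negation of xor. A minimal sketch of that relation, assuming the immediate is an 8-bit truth table so that negating the function flips every bit of it (xxeval_imm_not is a hypothetical helper name):

```c
#include <assert.h>

/* Immediate for the complement of a function whose immediate is IMM.
   Assumption: the immediate is an 8-bit truth table, so negating the
   function inverts all eight entries.  */
static unsigned
xxeval_imm_not (unsigned imm)
{
  assert (imm <= 255);
  return ~imm & 0xff;
}
```

Here xxeval_imm_not (45) yields 210.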

Diff:
---
 gcc/config/rs6000/fusion.md| 15 +--
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index dd8401d48228..e3d9f7376a8d 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -2204,20 +2204,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vandc -> veqv
 (define_insn "*fuse_vandc_veqv"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(not:VM (xor:VM (and:VM (not:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v"))
-  (match_operand:VM 1 "altivec_register_operand" "v,v,v,v"))
- (match_operand:VM 2 "altivec_register_operand" "v,v,v,v"))))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(not:VM (xor:VM (and:VM (not:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v"))
+  (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v"))
+ (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v"))))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vandc %3,%1,%0\;veqv %3,%3,%2
vandc %3,%1,%0\;veqv %3,%3,%2
vandc %3,%1,%0\;veqv %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,210
vandc %4,%1,%0\;veqv %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector veqv -> veqv
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index d89e78d4da03..3a603eb09675 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -247,6 +247,7 @@ sub gen_logical_addsubf
   "vorc_vxor"   => 180,
   "vorc_vor"=> 191,
   "vandc_vnor"  => 208,
+  "vandc_veqv"  => 210,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:2b8a8f1a9e461de365f0de90374398833adb8b74

commit 2b8a8f1a9e461de365f0de90374398833adb8b74
Author: Michael Meissner 
Date:   Tue Jun 10 17:55:21 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector nand => or fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md| 15 +--
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index e6d13b38415a..ba3a5a52b990 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -2711,20 +2711,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vnand -> vor
 (define_insn "*fuse_vnand_vor"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(ior:VM (ior:VM (not:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v"))
-  (not:VM (match_operand:VM 1 "altivec_register_operand" "v,v,v,v")))
- (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(ior:VM (ior:VM (not:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v"))
+  (not:VM (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v")))
+ (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v")))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vnand %3,%1,%0\;vor %3,%3,%2
vnand %3,%1,%0\;vor %3,%3,%2
vnand %3,%1,%0\;vor %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,239
vnand %4,%1,%0\;vor %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vnor -> vor
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 94eae471c64b..54699d199fc5 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -250,6 +250,7 @@ sub gen_logical_addsubf
   "vandc_veqv"  => 210,
   "vand_vnor"   => 224,
   "vnand_vxor"  => 225,
+  "vnand_vor"   => 239,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:1b6089173e851ffae80699d949666f4c27032e1c

commit 1b6089173e851ffae80699d949666f4c27032e1c
Author: Michael Meissner 
Date:   Tue Jun 10 16:20:16 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector/vector nand/nor fusion if XXEVAL is supported.
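The nand => nor combination reduces, by De Morgan, to a single truth-table row: nor (nand (b, c), a) is b & c & ~a. The sketch below shows this on single bits and computes the immediate that selects exactly one row. It assumes the row for inputs (a, b, c) is bit index 4*a + 2*b + c counted from the most significant bit, with a = %x2, b = %x1, c = %x0; xxeval_minterm_imm and nand_nor are hypothetical names.

```c
#include <assert.h>

/* Immediate that selects exactly the truth-table row for (a, b, c).
   Assumption: that row is bit index 4*a + 2*b + c from the MSB.  */
static unsigned
xxeval_minterm_imm (unsigned a, unsigned b, unsigned c)
{
  assert (a <= 1 && b <= 1 && c <= 1);
  return 1u << (7 - (4 * a + 2 * b + c));
}

/* nand => nor on single bits, written the way the two instructions
   compute it: first nand (b, c), then nor with a.  */
static unsigned
nand_nor (unsigned a, unsigned b, unsigned c)
{
  unsigned nand = ~(b & c) & 1;

  return ~(nand | a) & 1;
}
```

nand_nor is 1 only for a = 0, b = 1, c = 1, and xxeval_minterm_imm (0, 1, 1) is 16, the value added to the table.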

Diff:
---
 gcc/config/rs6000/fusion.md| 15 +--
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index f70422616ffd..c8a27a9e5471 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -2528,20 +2528,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vnand -> vnor
 (define_insn "*fuse_vnand_vnor"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(and:VM (not:VM (ior:VM (not:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v"))
-  (not:VM (match_operand:VM 1 "altivec_register_operand" "v,v,v,v"))))
- (not:VM (match_operand:VM 2 "altivec_register_operand" "v,v,v,v"))))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(and:VM (not:VM (ior:VM (not:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v"))
+  (not:VM (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v"))))
+ (not:VM (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v"))))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vnand %3,%1,%0\;vnor %3,%3,%2
vnand %3,%1,%0\;vnor %3,%3,%2
vnand %3,%1,%0\;vnor %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,16
vnand %4,%1,%0\;vnor %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vnor -> vnor
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 5beabe530a67..078bc6ca0dab 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -224,6 +224,7 @@ sub gen_logical_addsubf
   "vorc_vand"   =>  11,
   "vandc_vandc" =>  13,
   "vnand_vand"  =>  14,
+  "vnand_vnor"  =>  16,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:50059b8faffe92785770f269feca74119ddbd249

commit 50059b8faffe92785770f269feca74119ddbd249
Author: Michael Meissner 
Date:   Tue Jun 10 16:41:51 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector orc => eqv fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md| 15 +--
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index f45e65f0217c..f84d0aee5d79 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -2294,20 +2294,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vorc -> veqv
 (define_insn "*fuse_vorc_veqv"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(not:VM (xor:VM (ior:VM (not:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v"))
-  (match_operand:VM 1 "altivec_register_operand" "v,v,v,v"))
- (match_operand:VM 2 "altivec_register_operand" "v,v,v,v"))))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(not:VM (xor:VM (ior:VM (not:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v"))
+  (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v"))
+ (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v"))))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vorc %3,%1,%0\;veqv %3,%3,%2
vorc %3,%1,%0\;veqv %3,%3,%2
vorc %3,%1,%0\;veqv %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,75
vorc %4,%1,%0\;veqv %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vxor -> veqv
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 720e8d440c2d..8ba1aa081f75 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -230,6 +230,7 @@ sub gen_logical_addsubf
   "vandc_vxor"  =>  45,
   "vandc_vor"   =>  47,
   "vorc_vnor"   =>  64,
+  "vorc_veqv"   =>  75,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:ae5fc7bc755592194e73535df4f28dddf58cf517

commit ae5fc7bc755592194e73535df4f28dddf58cf517
Author: Michael Meissner 
Date:   Tue Jun 10 18:10:38 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector or => nand fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md| 15 +--
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index 01b7fda17ecc..39b586918c17 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -2435,20 +2435,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vor -> vnand
 (define_insn "*fuse_vor_vnand"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(ior:VM (not:VM (ior:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v")
-  (match_operand:VM 1 "altivec_register_operand" "v,v,v,v")))
- (not:VM (match_operand:VM 2 "altivec_register_operand" "v,v,v,v"))))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(ior:VM (not:VM (ior:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v")
+  (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v")))
+ (not:VM (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v"))))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vor %3,%1,%0\;vnand %3,%3,%2
vor %3,%1,%0\;vnand %3,%3,%2
vor %3,%1,%0\;vnand %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,248
vor %4,%1,%0\;vnand %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vorc -> vnand
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index d4965b6df864..86bca81286ca 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -255,6 +255,7 @@ sub gen_logical_addsubf
   "vorc_vnand"  => 244,
   "veqv_vnand"  => 246,
   "vnor_vnand"  => 247,
+  "vor_vnand"   => 248,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:b144edb81dc7fac54df459d2a5dfbd36ef0daf51

commit b144edb81dc7fac54df459d2a5dfbd36ef0daf51
Author: Michael Meissner 
Date:   Tue Jun 10 18:13:21 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector xor => nand fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md| 15 +--
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index 39b586918c17..e0f9ac17659a 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -2477,20 +2477,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vxor -> vnand
 (define_insn "*fuse_vxor_vnand"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(ior:VM (not:VM (xor:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v")
-  (match_operand:VM 1 "altivec_register_operand" "v,v,v,v")))
- (not:VM (match_operand:VM 2 "altivec_register_operand" "v,v,v,v"))))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(ior:VM (not:VM (xor:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v")
+  (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v")))
+ (not:VM (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v"))))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vxor %3,%1,%0\;vnand %3,%3,%2
vxor %3,%1,%0\;vnand %3,%3,%2
vxor %3,%1,%0\;vnand %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,249
vxor %4,%1,%0\;vnand %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vand -> vnor
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 86bca81286ca..5d22a0732df6 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -256,6 +256,7 @@ sub gen_logical_addsubf
   "veqv_vnand"  => 246,
   "vnor_vnand"  => 247,
   "vor_vnand"   => 248,
+  "vxor_vnand"  => 249,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:fa43d1cdb8be87fe0f2e6259ce17d8803806115a

commit fa43d1cdb8be87fe0f2e6259ce17d8803806115a
Author: Michael Meissner 
Date:   Tue Jun 10 16:55:47 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector nor => nor fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md| 15 +--
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index 1d4b3c970c7f..032c87ac5765 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -2555,20 +2555,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vnor -> vnor
 (define_insn "*fuse_vnor_vnor"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(and:VM (not:VM (and:VM (not:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v"))
-  (not:VM (match_operand:VM 1 "altivec_register_operand" "v,v,v,v"))))
- (not:VM (match_operand:VM 2 "altivec_register_operand" "v,v,v,v"))))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(and:VM (not:VM (and:VM (not:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v"))
+  (not:VM (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v"))))
+ (not:VM (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v"))))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vnor %3,%1,%0\;vnor %3,%3,%2
vnor %3,%1,%0\;vnor %3,%3,%2
vnor %3,%1,%0\;vnor %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,112
vnor %4,%1,%0\;vnor %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vor -> vnor
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 4ec38beccb9c..6af4c5d7a182 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -235,6 +235,7 @@ sub gen_logical_addsubf
   "veqv_vnor"   =>  96,
   "vxor_vxor"   => 105,
   "vxor_vor"=> 111,
+  "vnor_vnor"   => 112,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:c58f75947d43a31a787e4028e15e37f058fa121c

commit c58f75947d43a31a787e4028e15e37f058fa121c
Author: Michael Meissner 
Date:   Tue Jun 10 18:16:42 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector andc => nand fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md| 15 +--
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index e0f9ac17659a..129f7dfb26ed 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -2354,20 +2354,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vandc -> vnand
 (define_insn "*fuse_vandc_vnand"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(ior:VM (not:VM (and:VM (not:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v"))
-  (match_operand:VM 1 "altivec_register_operand" "v,v,v,v")))
- (not:VM (match_operand:VM 2 "altivec_register_operand" "v,v,v,v"))))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(ior:VM (not:VM (and:VM (not:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v"))
+  (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v")))
+ (not:VM (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v"))))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vandc %3,%1,%0\;vnand %3,%3,%2
vandc %3,%1,%0\;vnand %3,%3,%2
vandc %3,%1,%0\;vnand %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,253
vandc %4,%1,%0\;vnand %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector veqv -> vnand
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 5d22a0732df6..1d31c242042e 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -257,6 +257,7 @@ sub gen_logical_addsubf
   "vnor_vnand"  => 247,
   "vor_vnand"   => 248,
   "vxor_vnand"  => 249,
+  "vandc_vnand" => 253,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:c25a0ce44f75ec67528a2d855ef5229abfe4cf6d

commit c25a0ce44f75ec67528a2d855ef5229abfe4cf6d
Author: Michael Meissner 
Date:   Tue Jun 10 15:56:02 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector/vector andc/and fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md    | 15 +++++++++------
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index d24837d68d83..b9590b6d1104 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -1892,20 +1892,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vandc -> vand
 (define_insn "*fuse_vandc_vand"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(and:VM (and:VM (not:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v"))
-  (match_operand:VM 1 "altivec_register_operand" "v,v,v,v"))
- (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(and:VM (and:VM (not:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v"))
+  (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v"))
+ (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v")))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vandc %3,%1,%0\;vand %3,%3,%2
vandc %3,%1,%0\;vand %3,%3,%2
vandc %3,%1,%0\;vand %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,2
vandc %4,%1,%0\;vand %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector veqv -> vand
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 351a4d914a4a..23adf98c4056 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -216,6 +216,7 @@ sub gen_logical_addsubf
 
 my %xxeval_fusions = (
   "vand_vand"   =>   1,
+  "vandc_vand"  =>   2,
 );
 
 KIND: foreach $kind ('scalar','vector') {
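
A minimal sketch, not part of the commit, of how the immediate in the new
xxeval alternative can be derived.  It assumes the truth-table encoding of
xxeval: for each bit position the three source bits form the index
4*A + 2*B + C, and the result is bit "index" of the 8-bit immediate, counted
from the most significant bit.  With "xxeval %x3,%x2,%x1,%x0,N", A is
operand 2, B is operand 1 and C is operand 0.  The helper name xxeval_imm is
hypothetical.

```python
def xxeval_imm(fn):
    """Return the 8-bit xxeval immediate for a 3-input boolean function.

    fn(a, b, c) takes bits (0 or 1) and returns a bit.  Truth-table entry
    4*a + 2*b + c is stored in immediate bit (7 - index), i.e. MSB first.
    """
    imm = 0
    for a in (0, 1):
        for b in (0, 1):
            for c in (0, 1):
                index = 4 * a + 2 * b + c
                if fn(a, b, c) & 1:
                    imm |= 1 << (7 - index)
    return imm

# vandc -> vand: (op1 & ~op0) & op2, with a = op2, b = op1, c = op0.
vandc_vand = xxeval_imm(lambda a, b, c: (b & ~c) & a)
# vand -> vand: (op1 & op0) & op2.
vand_vand = xxeval_imm(lambda a, b, c: (b & c) & a)
print(vandc_vand, vand_vand)
```

Under that assumption the two values agree with the "vandc_vand" and
"vand_vand" entries of %xxeval_fusions in the hunk above.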


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:8bfffbaaf94907d12c535486cd9d8a3a53a1aea9

commit 8bfffbaaf94907d12c535486cd9d8a3a53a1aea9
Author: Michael Meissner 
Date:   Tue Jun 10 16:14:34 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector/vector andc/andc fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md    | 15 +++++++++------
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index e27f05f85f12..810d97963fb9 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -2054,20 +2054,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vandc -> vandc
 (define_insn "*fuse_vandc_vandc"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(and:VM (and:VM (not:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v"))
-  (match_operand:VM 1 "altivec_register_operand" "v,v,v,v"))
- (not:VM (match_operand:VM 2 "altivec_register_operand" "v,v,v,v"))))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(and:VM (and:VM (not:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v"))
+  (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v"))
+ (not:VM (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v"))))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vandc %3,%1,%0\;vandc %3,%3,%2
vandc %3,%1,%0\;vandc %3,%3,%2
vandc %3,%1,%0\;vandc %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,13
vandc %4,%1,%0\;vandc %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector veqv -> vandc
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index a3cc8b121eab..929257d6c03e 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -222,6 +222,7 @@ sub gen_logical_addsubf
   "vnor_vand"   =>   8,
   "veqv_vand"   =>   9,
   "vorc_vand"   =>  11,
+  "vandc_vandc" =>  13,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:728f505faf021f60c6157433af58bf1c7499

commit 728f505faf021f60c6157433af58bf1c7499
Author: Michael Meissner 
Date:   Tue Jun 10 16:23:42 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector/vector and/xor fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md    | 15 +++++++++------
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index c8a27a9e5471..789a4d592419 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -2909,20 +2909,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vand -> vxor
 (define_insn "*fuse_vand_vxor"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(xor:VM (and:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v")
-  (match_operand:VM 1 "altivec_register_operand" "v,v,v,v"))
- (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(xor:VM (and:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v")
+  (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v"))
+ (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v")))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vand %3,%1,%0\;vxor %3,%3,%2
vand %3,%1,%0\;vxor %3,%3,%2
vand %3,%1,%0\;vxor %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,30
vand %4,%1,%0\;vxor %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vandc -> vxor
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 078bc6ca0dab..e6d44d430b3a 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -225,6 +225,7 @@ sub gen_logical_addsubf
   "vandc_vandc" =>  13,
   "vnand_vand"  =>  14,
   "vnand_vnor"  =>  16,
+  "vand_vxor"   =>  30,
 );
 
 KIND: foreach $kind ('scalar','vector') {
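
As an illustration only, the following sketch models why the single xxeval
alternative can stand in for the vand/vxor pair.  It assumes the same
truth-table encoding as the instruction (entry 4*A + 2*B + C, entry 0 in the
most significant immediate bit) and models a vector register as a 128-bit
Python integer.  The function names are hypothetical and are not GCC or
assembler APIs.

```python
import random

MASK = (1 << 128) - 1  # one 128-bit vector register

def xxeval(a, b, c, imm):
    """Bitwise ternary logic: bit i of the result is truth-table entry
    4*a_i + 2*b_i + c_i of imm, with entry 0 in the most significant bit."""
    result = 0
    for index in range(8):
        if not (imm >> (7 - index)) & 1:
            continue
        # Select the bit positions whose (a, b, c) bits match this entry.
        ta = a if index & 4 else ~a
        tb = b if index & 2 else ~b
        tc = c if index & 1 else ~c
        result |= ta & tb & tc
    return result & MASK

def vand_vxor(op0, op1, op2):
    """The two-instruction sequence: vand tmp,op1,op0 ; vxor dst,tmp,op2."""
    tmp = op1 & op0
    return (tmp ^ op2) & MASK

rng = random.Random(0)
for _ in range(1000):
    op0, op1, op2 = (rng.getrandbits(128) for _ in range(3))
    # Operand order mirrors "xxeval %x3,%x2,%x1,%x0,30".
    assert xxeval(op2, op1, op0, 30) == vand_vxor(op0, op1, op2)
```

The loop compares both forms on random inputs; it is a software model of the
logic only and says nothing about latency or register allocation.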


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:4a0c4551f0f9df5a312141b8a5bff58645f5115f

commit 4a0c4551f0f9df5a312141b8a5bff58645f5115f
Author: Michael Meissner 
Date:   Tue Jun 10 16:11:38 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector/vector orc/and fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md    | 15 +++++++++------
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index cce179e0c974..e27f05f85f12 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -1994,20 +1994,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vorc -> vand
 (define_insn "*fuse_vorc_vand"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(and:VM (ior:VM (not:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v"))
-  (match_operand:VM 1 "altivec_register_operand" "v,v,v,v"))
- (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(and:VM (ior:VM (not:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v"))
+  (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v"))
+ (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v")))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vorc %3,%1,%0\;vand %3,%3,%2
vorc %3,%1,%0\;vand %3,%3,%2
vorc %3,%1,%0\;vand %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,11
vorc %4,%1,%0\;vand %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vxor -> vand
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 268b94089484..a3cc8b121eab 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -221,6 +221,7 @@ sub gen_logical_addsubf
   "vor_vand"=>   7,
   "vnor_vand"   =>   8,
   "veqv_vand"   =>   9,
+  "vorc_vand"   =>  11,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:a75dd22985863e77b051d70f250e939214ba32dc

commit a75dd22985863e77b051d70f250e939214ba32dc
Author: Michael Meissner 
Date:   Tue Jun 10 16:07:51 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector/vector nor/and fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md    | 15 +++++++++------
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index ed15fccdf760..cce179e0c974 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -1913,20 +1913,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector veqv -> vand
 (define_insn "*fuse_veqv_vand"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(and:VM (not:VM (xor:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v")
-  (match_operand:VM 1 "altivec_register_operand" "v,v,v,v")))
- (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(and:VM (not:VM (xor:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v")
+  (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v")))
+ (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v")))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
veqv %3,%1,%0\;vand %3,%3,%2
veqv %3,%1,%0\;vand %3,%3,%2
veqv %3,%1,%0\;vand %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,9
veqv %4,%1,%0\;vand %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vnand -> vand
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 40d62ae8e9c1..268b94089484 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -220,6 +220,7 @@ sub gen_logical_addsubf
   "vxor_vand"   =>   6,
   "vor_vand"=>   7,
   "vnor_vand"   =>   8,
+  "veqv_vand"   =>   9,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:31416ca5a3c12ca137ce74887f2e6bcaed62e4ca

commit 31416ca5a3c12ca137ce74887f2e6bcaed62e4ca
Author: Michael Meissner 
Date:   Tue Jun 10 16:39:17 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector orc => nor fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md    | 15 +++++++++------
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index ed70ac059dfc..f45e65f0217c 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -2585,20 +2585,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vorc -> vnor
 (define_insn "*fuse_vorc_vnor"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(and:VM (not:VM (ior:VM (not:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v"))
-  (match_operand:VM 1 "altivec_register_operand" "v,v,v,v")))
- (not:VM (match_operand:VM 2 "altivec_register_operand" "v,v,v,v"))))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(and:VM (not:VM (ior:VM (not:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v"))
+  (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v")))
+ (not:VM (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v"))))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vorc %3,%1,%0\;vnor %3,%3,%2
vorc %3,%1,%0\;vnor %3,%3,%2
vorc %3,%1,%0\;vnor %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,64
vorc %4,%1,%0\;vnor %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vxor -> vnor
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 69fa544f0317..720e8d440c2d 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -229,6 +229,7 @@ sub gen_logical_addsubf
   "vand_vor"=>  31,
   "vandc_vxor"  =>  45,
   "vandc_vor"   =>  47,
+  "vorc_vnor"   =>  64,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:14010241a92479cbcf0bf6cc5a584065821c9173

commit 14010241a92479cbcf0bf6cc5a584065821c9173
Author: Michael Meissner 
Date:   Tue Jun 10 16:17:16 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector/vector nand/and fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md    | 15 +++++++++------
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index 810d97963fb9..f70422616ffd 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -1934,20 +1934,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vnand -> vand
 (define_insn "*fuse_vnand_vand"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(and:VM (ior:VM (not:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v"))
-  (not:VM (match_operand:VM 1 "altivec_register_operand" "v,v,v,v")))
- (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(and:VM (ior:VM (not:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v"))
+  (not:VM (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v")))
+ (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v")))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vnand %3,%1,%0\;vand %3,%3,%2
vnand %3,%1,%0\;vand %3,%3,%2
vnand %3,%1,%0\;vand %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,14
vnand %4,%1,%0\;vand %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vnor -> vand
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 929257d6c03e..5beabe530a67 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -223,6 +223,7 @@ sub gen_logical_addsubf
   "veqv_vand"   =>   9,
   "vorc_vand"   =>  11,
   "vandc_vandc" =>  13,
+  "vnand_vand"  =>  14,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:5e246c3eea3303bbc3be494765009cc443eda276

commit 5e246c3eea3303bbc3be494765009cc443eda276
Author: Michael Meissner 
Date:   Tue Jun 10 16:36:08 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector andc => or fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md    | 15 +++++++++------
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index 6e5c88b81b44..ed70ac059dfc 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -2642,20 +2642,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vandc -> vor
 (define_insn "*fuse_vandc_vor"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(ior:VM (and:VM (not:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v"))
-  (match_operand:VM 1 "altivec_register_operand" "v,v,v,v"))
- (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(ior:VM (and:VM (not:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v"))
+  (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v"))
+ (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v")))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vandc %3,%1,%0\;vor %3,%3,%2
vandc %3,%1,%0\;vor %3,%3,%2
vandc %3,%1,%0\;vor %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,47
vandc %4,%1,%0\;vor %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector veqv -> vor
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index d15208a4ad3e..69fa544f0317 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -228,6 +228,7 @@ sub gen_logical_addsubf
   "vand_vxor"   =>  30,
   "vand_vor"=>  31,
   "vandc_vxor"  =>  45,
+  "vandc_vor"   =>  47,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:a59034f2a6084d924133668e58db1b2bba2e6a43

commit a59034f2a6084d924133668e58db1b2bba2e6a43
Author: Michael Meissner 
Date:   Tue Jun 10 17:20:17 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector nor => or fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md    | 15 +++++++++------
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index 1f1756dbe63e..66d98f4537e1 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -2714,20 +2714,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vnor -> vor
 (define_insn "*fuse_vnor_vor"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(ior:VM (and:VM (not:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v"))
-  (not:VM (match_operand:VM 1 "altivec_register_operand" "v,v,v,v")))
- (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(ior:VM (and:VM (not:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v"))
+  (not:VM (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v")))
+ (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v")))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vnor %3,%1,%0\;vor %3,%3,%2
vnor %3,%1,%0\;vor %3,%3,%2
vnor %3,%1,%0\;vor %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,143
vnor %4,%1,%0\;vor %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vor -> vor
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 0fea2d6d8482..98b56b788f03 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -240,6 +240,7 @@ sub gen_logical_addsubf
   "vor_vor" => 127,
   "vor_vnor"=> 128,
   "vnor_vxor"   => 135,
+  "vnor_vor"=> 143,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:b3706b848edd2b31abe5ef2698392805303548d2

commit b3706b848edd2b31abe5ef2698392805303548d2
Author: Michael Meissner 
Date:   Tue Jun 10 16:50:19 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector xor => xor fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md    | 15 +++++++++------
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index e5099178d63d..a848b21bc3e2 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -3059,20 +3059,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vxor -> vxor
 (define_insn "*fuse_vxor_vxor"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(xor:VM (xor:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v")
-  (match_operand:VM 1 "altivec_register_operand" "%v,v,v,v"))
- (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(xor:VM (xor:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v")
+  (match_operand:VM 1 "vector_fusion_operand" "%v,v,v,wa,v"))
+ (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v")))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vxor %3,%1,%0\;vxor %3,%3,%2
vxor %3,%1,%0\;vxor %3,%3,%2
vxor %3,%1,%0\;vxor %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,105
vxor %4,%1,%0\;vxor %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; add-add fusion pattern generated by gen_addadd
 (define_insn "*fuse_add_add"
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 79d9eaed7da6..b9ff6c99b95e 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -233,6 +233,7 @@ sub gen_logical_addsubf
   "vorc_veqv"   =>  75,
   "vorc_vorc"   =>  79,
   "veqv_vnor"   =>  96,
+  "vxor_vxor"   => 105,
 );
 
 KIND: foreach $kind ('scalar','vector') {
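
The commit title ties these patterns to SHA3.  As an illustration only (the
commit does not show which SHA3 code was measured), the theta step of Keccak
computes a column parity as the XOR of five lanes, which can be grouped into
two three-input XORs, the function that immediate 105 selects under the
truth-table encoding of xxeval.  The function names below are hypothetical
and lanes are modelled as 64-bit Python integers.

```python
from functools import reduce
from operator import xor

LANE_MASK = (1 << 64) - 1

def xor3(a, b, c):
    """Three-input XOR, the function selected by xxeval immediate 105."""
    return (a ^ b ^ c) & LANE_MASK

def column_parity(lanes):
    """XOR of the five lanes in one Keccak column using two xor3 steps."""
    a0, a1, a2, a3, a4 = lanes
    return xor3(xor3(a0, a1, a2), a3, a4)

# Immediate 105 is the odd-parity truth table: entry i (MSB first) is set
# exactly when i has an odd number of one bits.
imm_is_parity = all(((105 >> (7 - i)) & 1) == bin(i).count("1") % 2
                    for i in range(8))

lanes = [0x0123456789ABCDEF, 0xFEDCBA9876543210, 0x0F0F0F0F0F0F0F0F,
         0x00FF00FF00FF00FF, 0xAAAAAAAAAAAAAAAA]
assert imm_is_parity
assert column_parity(lanes) == reduce(xor, lanes)
```

Whether the compiler actually forms these groupings for a given SHA3
implementation depends on the combine pass and is not shown here.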


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:20db16cfc7f50b4f218466526dff7a6d01c85cba

commit 20db16cfc7f50b4f218466526dff7a6d01c85cba
Author: Michael Meissner 
Date:   Tue Jun 10 17:28:09 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector eqv => or fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md    | 15 +++++++++------
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index bb62ae26445a..cb1ad8b4c0cc 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -2681,20 +2681,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector veqv -> vor
 (define_insn "*fuse_veqv_vor"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(ior:VM (not:VM (xor:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v")
-  (match_operand:VM 1 "altivec_register_operand" "v,v,v,v")))
- (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(ior:VM (not:VM (xor:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v")
+  (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v")))
+ (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v")))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
veqv %3,%1,%0\;vor %3,%3,%2
veqv %3,%1,%0\;vor %3,%3,%2
veqv %3,%1,%0\;vor %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,159
veqv %4,%1,%0\;vor %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vnand -> vor
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 726e29c798bc..9400aed267a6 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -243,6 +243,7 @@ sub gen_logical_addsubf
   "vnor_vor"=> 143,
   "vxor_vnor"   => 144,
   "veqv_vxor"   => 150,
+  "veqv_vor"=> 159,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:9a5a4fedf0d84ea698e1e31304cf6a4d8652051c

commit 9a5a4fedf0d84ea698e1e31304cf6a4d8652051c
Author: Michael Meissner 
Date:   Tue Jun 10 17:00:25 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector or => or fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md    | 15 +++++++++------
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index d1f6a38b618a..c2a2ebf4bfaf 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -2729,20 +2729,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vor -> vor
 (define_insn "*fuse_vor_vor"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(ior:VM (ior:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v")
-  (match_operand:VM 1 "altivec_register_operand" "%v,v,v,v"))
- (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(ior:VM (ior:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v")
+  (match_operand:VM 1 "vector_fusion_operand" "%v,v,v,wa,v"))
+ (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v")))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vor %3,%1,%0\;vor %3,%3,%2
vor %3,%1,%0\;vor %3,%3,%2
vor %3,%1,%0\;vor %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,127
vor %4,%1,%0\;vor %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vorc -> vor
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 97681f37d0fa..9df4c8d6527e 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -237,6 +237,7 @@ sub gen_logical_addsubf
   "vxor_vor"=> 111,
   "vnor_vnor"   => 112,
   "vor_vxor"=> 120,
+  "vor_vor" => 127,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:0d518163697d4fc03874268389223ad567abf13a

commit 0d518163697d4fc03874268389223ad567abf13a
Author: Michael Meissner 
Date:   Tue Jun 10 16:47:39 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector eqv => nor fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md    | 15 +++++++++------
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index 486aa813575d..e5099178d63d 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -2513,20 +2513,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector veqv -> vnor
 (define_insn "*fuse_veqv_vnor"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(and:VM (not:VM (not:VM (xor:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v")
-  (match_operand:VM 1 "altivec_register_operand" "v,v,v,v"))))
- (not:VM (match_operand:VM 2 "altivec_register_operand" "v,v,v,v"))))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(and:VM (not:VM (not:VM (xor:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v")
+  (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v"))))
+ (not:VM (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v"))))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
veqv %3,%1,%0\;vnor %3,%3,%2
veqv %3,%1,%0\;vnor %3,%3,%2
veqv %3,%1,%0\;vnor %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,96
veqv %4,%1,%0\;vnor %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vnand -> vnor
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 8f60fe76c87b..79d9eaed7da6 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -232,6 +232,7 @@ sub gen_logical_addsubf
   "vorc_vnor"   =>  64,
   "vorc_veqv"   =>  75,
   "vorc_vorc"   =>  79,
+  "veqv_vnor"   =>  96,
 );
 
 KIND: foreach $kind ('scalar','vector') {
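
Each regenerated pattern gains a fifth constraint alternative, and the comma-separated constraint and attribute strings have to stay aligned position by position. The sketch below is only an illustration of that alignment, using the strings from the *fuse_veqv_vnor pattern above. The dictionary keys and the function name alternatives are hypothetical labels, not anything GCC defines.

```python
# Constraint and attribute strings copied from the regenerated
# *fuse_veqv_vnor pattern; the dictionary keys are illustrative labels.
PATTERN = {
    "dest": "=&0,&1,&v,wa,v",       # operand 3
    "src": "v,v,v,wa,v",            # operands 0, 1 and 2
    "scratch": "=X,X,X,X,&v",       # operand 4
    "prefixed": "*,*,*,yes,*",
    "isa": "*,*,*,xxeval,*",
}

# Output templates in the same order as the alternatives.
TEMPLATES = [
    r"veqv %3,%1,%0\;vnor %3,%3,%2",
    r"veqv %3,%1,%0\;vnor %3,%3,%2",
    r"veqv %3,%1,%0\;vnor %3,%3,%2",
    r"xxeval %x3,%x2,%x1,%x0,96",
    r"veqv %4,%1,%0\;vnor %3,%4,%2",
]


def alternatives(pattern, templates):
    """Split every comma-separated string and group the pieces by alternative."""
    columns = {name: text.split(",") for name, text in pattern.items()}
    for name, column in columns.items():
        if len(column) != len(templates):
            raise ValueError(f"{name} has {len(column)} alternatives, "
                             f"expected {len(templates)}")
    rows = []
    for i, template in enumerate(templates):
        row = {name: column[i] for name, column in columns.items()}
        row["template"] = template
        rows.append(row)
    return rows


if __name__ == "__main__":
    for i, row in enumerate(alternatives(PATTERN, TEMPLATES)):
        print(i, row)
```

Read this way, the alternative at index 3 is the only one that uses the "wa" constraint, and it is the one marked prefixed and restricted to the xxeval isa. The scratch register is still requested only by the last alternative.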


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:0dfbbfadd4b128ca440e3b38c0abbb0a6f1bf35e

commit 0dfbbfadd4b128ca440e3b38c0abbb0a6f1bf35e
Author: Michael Meissner 
Date:   Tue Jun 10 16:58:06 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector or => xor fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md| 15 +--
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index 032c87ac5765..d1f6a38b618a 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -3029,20 +3029,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vor -> vxor
 (define_insn "*fuse_vor_vxor"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(xor:VM (ior:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v")
-  (match_operand:VM 1 "altivec_register_operand" "v,v,v,v"))
- (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(xor:VM (ior:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v")
+  (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v"))
+ (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v")))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vor %3,%1,%0\;vxor %3,%3,%2
vor %3,%1,%0\;vxor %3,%3,%2
vor %3,%1,%0\;vxor %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,120
vor %4,%1,%0\;vxor %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vorc -> vxor
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 6af4c5d7a182..97681f37d0fa 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -236,6 +236,7 @@ sub gen_logical_addsubf
   "vxor_vxor"   => 105,
   "vxor_vor"=> 111,
   "vnor_vnor"   => 112,
+  "vor_vxor"=> 120,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:29fb0c6566c9ef1f149d1084dc2581c872649b12

commit 29fb0c6566c9ef1f149d1084dc2581c872649b12
Author: Michael Meissner 
Date:   Tue Jun 10 17:17:48 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector nor => xor fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md| 15 +--
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index c55e9d4abd67..1f1756dbe63e 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -3017,20 +3017,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vnor -> vxor
 (define_insn "*fuse_vnor_vxor"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(xor:VM (and:VM (not:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v"))
-  (not:VM (match_operand:VM 1 "altivec_register_operand" "v,v,v,v")))
- (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(xor:VM (and:VM (not:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v"))
+  (not:VM (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v")))
+ (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v")))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vnor %3,%1,%0\;vxor %3,%3,%2
vnor %3,%1,%0\;vxor %3,%3,%2
vnor %3,%1,%0\;vxor %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,135
vnor %4,%1,%0\;vxor %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vor -> vxor
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 58f900640bef..0fea2d6d8482 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -239,6 +239,7 @@ sub gen_logical_addsubf
   "vor_vxor"=> 120,
   "vor_vor" => 127,
   "vor_vnor"=> 128,
+  "vnor_vxor"   => 135,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:83297e5ec6c3b8713c10c628aa7acc2a7cfda2c3

commit 83297e5ec6c3b8713c10c628aa7acc2a7cfda2c3
Author: Michael Meissner 
Date:   Tue Jun 10 16:26:51 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector and => or fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md| 15 +--
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index 789a4d592419..fccea39d0aae 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -2621,20 +2621,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vand -> vor
 (define_insn "*fuse_vand_vor"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(ior:VM (and:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v")
-  (match_operand:VM 1 "altivec_register_operand" "v,v,v,v"))
- (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(ior:VM (and:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v")
+  (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v"))
+ (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v")))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vand %3,%1,%0\;vor %3,%3,%2
vand %3,%1,%0\;vor %3,%3,%2
vand %3,%1,%0\;vor %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,31
vand %4,%1,%0\;vor %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vandc -> vor
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index e6d44d430b3a..ab714b10f622 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -226,6 +226,7 @@ sub gen_logical_addsubf
   "vnand_vand"  =>  14,
   "vnand_vnor"  =>  16,
   "vand_vxor"   =>  30,
+  "vand_vor"=>  31,
 );
 
 KIND: foreach $kind ('scalar','vector') {
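
Each of these commits adds one name => immediate entry to the table in gen_logical_addsubf, and the regenerated pattern then gains an xxeval alternative. A rough sketch of that lookup step follows. It is not the actual genfusion.pl logic, which is not shown in these diffs. The function name xxeval_template is hypothetical, and the table holds only the entries visible in the hunk above.

```python
# Entries visible in the genfusion.pl hunk above (name => immediate).
XXEVAL_TABLE = {
    "vnand_vand": 14,
    "vnand_vnor": 16,
    "vand_vxor": 30,
    "vand_vor": 31,
}


def xxeval_template(inner, outer, table=XXEVAL_TABLE):
    """Return the xxeval output template for a fused pair, or None.

    None means the table has no entry for the pair, so a generator
    following this scheme would emit no xxeval alternative for it.
    """
    imm = table.get(f"{inner}_{outer}")
    if imm is None:
        return None
    # Operand order matches the templates in the regenerated fusion.md.
    return f"xxeval %x3,%x2,%x1,%x0,{imm}"


if __name__ == "__main__":
    print(xxeval_template("vand", "vor"))
    print(xxeval_template("vor", "vand"))
```

Keeping the immediates in a single table keyed by the fused pair name means adding a new pair is a one-line change followed by regenerating fusion.md, which is the shape of every commit in this series.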


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:c5ffa5bd175214d557b2bec0ee3ea6bd64745ac7

commit c5ffa5bd175214d557b2bec0ee3ea6bd64745ac7
Author: Michael Meissner 
Date:   Tue Jun 10 16:53:16 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector xor => or fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md| 15 +--
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index a848b21bc3e2..1d4b3c970c7f 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -2762,20 +2762,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vxor -> vor
 (define_insn "*fuse_vxor_vor"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(ior:VM (xor:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v")
-  (match_operand:VM 1 "altivec_register_operand" "v,v,v,v"))
- (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(ior:VM (xor:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v")
+  (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v"))
+ (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v")))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vxor %3,%1,%0\;vor %3,%3,%2
vxor %3,%1,%0\;vor %3,%3,%2
vxor %3,%1,%0\;vor %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,111
vxor %4,%1,%0\;vor %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vand -> vorc
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index b9ff6c99b95e..4ec38beccb9c 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -234,6 +234,7 @@ sub gen_logical_addsubf
   "vorc_vorc"   =>  79,
   "veqv_vnor"   =>  96,
   "vxor_vxor"   => 105,
+  "vxor_vor"=> 111,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:c29e8af614954a24935024b7662144e94a313b94

commit c29e8af614954a24935024b7662144e94a313b94
Author: Michael Meissner 
Date:   Tue Jun 10 18:04:39 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector eqv => nand fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md| 15 +--
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index 96d8951049c9..c1be0e5ff8f1 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -2372,20 +2372,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector veqv -> vnand
 (define_insn "*fuse_veqv_vnand"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(ior:VM (not:VM (not:VM (xor:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v")
-  (match_operand:VM 1 "altivec_register_operand" "v,v,v,v"))))
- (not:VM (match_operand:VM 2 "altivec_register_operand" "v,v,v,v"))))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(ior:VM (not:VM (not:VM (xor:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v")
+  (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v"))))
+ (not:VM (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v"))))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
veqv %3,%1,%0\;vnand %3,%3,%2
veqv %3,%1,%0\;vnand %3,%3,%2
veqv %3,%1,%0\;vnand %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,246
veqv %4,%1,%0\;vnand %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vnand -> vnand
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 77d3e999eb93..4c70237d2d27 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -253,6 +253,7 @@ sub gen_logical_addsubf
   "vnand_vor"   => 239,
   "vnand_vnand" => 241,
   "vorc_vnand"  => 244,
+  "veqv_vnand"  => 246,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:41f7160530b5590d445c0eba8edc13349fca0fbf

commit 41f7160530b5590d445c0eba8edc13349fca0fbf
Author: Michael Meissner 
Date:   Tue Jun 10 17:52:13 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector nand => xor fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md| 15 +--
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index 68b52d4f5893..e6d13b38415a 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -3023,20 +3023,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vnand -> vxor
 (define_insn "*fuse_vnand_vxor"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(xor:VM (ior:VM (not:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v"))
-  (not:VM (match_operand:VM 1 "altivec_register_operand" "v,v,v,v")))
- (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(xor:VM (ior:VM (not:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v"))
+  (not:VM (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v")))
+ (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v")))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vnand %3,%1,%0\;vxor %3,%3,%2
vnand %3,%1,%0\;vxor %3,%3,%2
vnand %3,%1,%0\;vxor %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,225
vnand %4,%1,%0\;vxor %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vnor -> vxor
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 56e5d96ec5f3..94eae471c64b 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -249,6 +249,7 @@ sub gen_logical_addsubf
   "vandc_vnor"  => 208,
   "vandc_veqv"  => 210,
   "vand_vnor"   => 224,
+  "vnand_vxor"  => 225,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:f656cd986db4f0d02a93bde8ca178b566face611

commit f656cd986db4f0d02a93bde8ca178b566face611
Author: Michael Meissner 
Date:   Tue Jun 10 16:44:55 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector orc => orc fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md| 15 +--
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index f84d0aee5d79..486aa813575d 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -2885,20 +2885,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vorc -> vorc
 (define_insn "*fuse_vorc_vorc"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(ior:VM (ior:VM (not:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v"))
-  (match_operand:VM 1 "altivec_register_operand" "v,v,v,v"))
- (not:VM (match_operand:VM 2 "altivec_register_operand" "v,v,v,v"))))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(ior:VM (ior:VM (not:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v"))
+  (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v"))
+ (not:VM (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v"))))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vorc %3,%1,%0\;vorc %3,%3,%2
vorc %3,%1,%0\;vorc %3,%3,%2
vorc %3,%1,%0\;vorc %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,79
vorc %4,%1,%0\;vorc %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vxor -> vorc
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 8ba1aa081f75..8f60fe76c87b 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -231,6 +231,7 @@ sub gen_logical_addsubf
   "vandc_vor"   =>  47,
   "vorc_vnor"   =>  64,
   "vorc_veqv"   =>  75,
+  "vorc_vorc"   =>  79,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:2b9d7bfbad695f60f0dbc8afeead50fe309cc761

commit 2b9d7bfbad695f60f0dbc8afeead50fe309cc761
Author: Michael Meissner 
Date:   Tue Jun 10 17:25:30 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector eqv => xor fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md| 15 +--
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index e5ea37c567d6..bb62ae26445a 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -2987,20 +2987,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector veqv -> vxor
 (define_insn "*fuse_veqv_vxor"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(xor:VM (not:VM (xor:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v")
-  (match_operand:VM 1 "altivec_register_operand" "v,v,v,v")))
- (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(xor:VM (not:VM (xor:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v")
+  (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v")))
+ (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v")))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
veqv %3,%1,%0\;vxor %3,%3,%2
veqv %3,%1,%0\;vxor %3,%3,%2
veqv %3,%1,%0\;vxor %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,150
veqv %4,%1,%0\;vxor %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vnand -> vxor
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index d713d10a1dbc..726e29c798bc 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -242,6 +242,7 @@ sub gen_logical_addsubf
   "vnor_vxor"   => 135,
   "vnor_vor"=> 143,
   "vxor_vnor"   => 144,
+  "veqv_vxor"   => 150,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:a55729c0c364eb53b14d70af25b13f20f88c6bbd

commit a55729c0c364eb53b14d70af25b13f20f88c6bbd
Author: Michael Meissner 
Date:   Tue Jun 10 17:48:14 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector and => nor fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md| 15 +--
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index e3d9f7376a8d..68b52d4f5893 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -2480,20 +2480,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vand -> vnor
 (define_insn "*fuse_vand_vnor"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(and:VM (not:VM (and:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v")
-  (match_operand:VM 1 "altivec_register_operand" "v,v,v,v")))
- (not:VM (match_operand:VM 2 "altivec_register_operand" "v,v,v,v"))))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(and:VM (not:VM (and:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v")
+  (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v")))
+ (not:VM (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v"))))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vand %3,%1,%0\;vnor %3,%3,%2
vand %3,%1,%0\;vnor %3,%3,%2
vand %3,%1,%0\;vnor %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,224
vand %4,%1,%0\;vnor %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vandc -> vnor
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 3a603eb09675..56e5d96ec5f3 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -248,6 +248,7 @@ sub gen_logical_addsubf
   "vorc_vor"=> 191,
   "vandc_vnor"  => 208,
   "vandc_veqv"  => 210,
+  "vand_vnor"   => 224,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:dbc3e4ec82fcfcf17f12065b7b7e71cca3af9c19

commit dbc3e4ec82fcfcf17f12065b7b7e71cca3af9c19
Author: Michael Meissner 
Date:   Tue Jun 10 17:34:06 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector orc => xor fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md| 15 +--
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index cb1ad8b4c0cc..3d7e6502b027 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -3071,20 +3071,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vorc -> vxor
 (define_insn "*fuse_vorc_vxor"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(xor:VM (ior:VM (not:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v"))
-  (match_operand:VM 1 "altivec_register_operand" "v,v,v,v"))
- (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(xor:VM (ior:VM (not:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v"))
+  (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v"))
+ (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v")))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vorc %3,%1,%0\;vxor %3,%3,%2
vorc %3,%1,%0\;vxor %3,%3,%2
vorc %3,%1,%0\;vxor %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,180
vorc %4,%1,%0\;vxor %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vxor -> vxor
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 9400aed267a6..15f931baad33 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -244,6 +244,7 @@ sub gen_logical_addsubf
   "vxor_vnor"   => 144,
   "veqv_vxor"   => 150,
   "veqv_vor"=> 159,
+  "vorc_vxor"   => 180,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:1eee925a0672077c5d0518f58b1723c2beea16f0

commit 1eee925a0672077c5d0518f58b1723c2beea16f0
Author: Michael Meissner 
Date:   Tue Jun 10 17:58:15 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector nand => nand fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md| 15 +--
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index ba3a5a52b990..241b8a494fb1 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -2390,20 +2390,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vnand -> vnand
 (define_insn "*fuse_vnand_vnand"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(ior:VM (not:VM (ior:VM (not:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v"))
-  (not:VM (match_operand:VM 1 "altivec_register_operand" "v,v,v,v"))))
- (not:VM (match_operand:VM 2 "altivec_register_operand" "v,v,v,v"))))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(ior:VM (not:VM (ior:VM (not:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v"))
+  (not:VM (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v"))))
+ (not:VM (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v"))))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vnand %3,%1,%0\;vnand %3,%3,%2
vnand %3,%1,%0\;vnand %3,%3,%2
vnand %3,%1,%0\;vnand %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,241
vnand %4,%1,%0\;vnand %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vnor -> vnand
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 54699d199fc5..728a447c65a9 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -251,6 +251,7 @@ sub gen_logical_addsubf
   "vand_vnor"   => 224,
   "vnand_vxor"  => 225,
   "vnand_vor"   => 239,
+  "vnand_vnand" => 241,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:539aa4e4b24ae01bdf157cd10feb7dc9ce63c5ca

commit 539aa4e4b24ae01bdf157cd10feb7dc9ce63c5ca
Author: Michael Meissner 
Date:   Tue Jun 10 17:22:37 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector xor => nor fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md| 15 +--
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index 66d98f4537e1..e5ea37c567d6 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -2618,20 +2618,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vxor -> vnor
 (define_insn "*fuse_vxor_vnor"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(and:VM (not:VM (xor:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v")
-  (match_operand:VM 1 "altivec_register_operand" "v,v,v,v")))
- (not:VM (match_operand:VM 2 "altivec_register_operand" "v,v,v,v"))))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(and:VM (not:VM (xor:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v")
+  (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v")))
+ (not:VM (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v"))))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vxor %3,%1,%0\;vnor %3,%3,%2
vxor %3,%1,%0\;vnor %3,%3,%2
vxor %3,%1,%0\;vnor %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,144
vxor %4,%1,%0\;vnor %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vand -> vor
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 98b56b788f03..d713d10a1dbc 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -241,6 +241,7 @@ sub gen_logical_addsubf
   "vor_vnor"=> 128,
   "vnor_vxor"   => 135,
   "vnor_vor"=> 143,
+  "vxor_vnor"   => 144,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:d8dc55ea7482cad5d7e776d3b080c233672e

commit d8dc55ea7482cad5d7e776d3b080c233672e
Author: Michael Meissner 
Date:   Tue Jun 10 17:40:41 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector andc => nor fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md| 15 +--
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index f6dc26e9c1f2..dd8401d48228 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -2495,20 +2495,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vandc -> vnor
 (define_insn "*fuse_vandc_vnor"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(and:VM (not:VM (and:VM (not:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v"))
-  (match_operand:VM 1 "altivec_register_operand" "v,v,v,v")))
- (not:VM (match_operand:VM 2 "altivec_register_operand" "v,v,v,v"))))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(and:VM (not:VM (and:VM (not:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v"))
+  (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v")))
+ (not:VM (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v"))))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vandc %3,%1,%0\;vnor %3,%3,%2
vandc %3,%1,%0\;vnor %3,%3,%2
vandc %3,%1,%0\;vnor %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,208
vandc %4,%1,%0\;vnor %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector veqv -> vnor
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 62f2b9e36d89..d89e78d4da03 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -246,6 +246,7 @@ sub gen_logical_addsubf
   "veqv_vor"=> 159,
   "vorc_vxor"   => 180,
   "vorc_vor"=> 191,
+  "vandc_vnor"  => 208,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:71bee51da51e6cd1ce25dd48c69c2aaf8840791b

commit 71bee51da51e6cd1ce25dd48c69c2aaf8840791b
Author: Michael Meissner 
Date:   Tue Jun 10 18:07:26 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector nor => nand fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md| 15 +--
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index c1be0e5ff8f1..01b7fda17ecc 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -2414,20 +2414,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vnor -> vnand
 (define_insn "*fuse_vnor_vnand"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(ior:VM (not:VM (and:VM (not:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v"))
-  (not:VM (match_operand:VM 1 "altivec_register_operand" "v,v,v,v"))))
- (not:VM (match_operand:VM 2 "altivec_register_operand" "v,v,v,v"))))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(ior:VM (not:VM (and:VM (not:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v"))
+  (not:VM (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v"))))
+ (not:VM (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v"))))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vnor %3,%1,%0\;vnand %3,%3,%2
vnor %3,%1,%0\;vnand %3,%3,%2
vnor %3,%1,%0\;vnand %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,247
vnor %4,%1,%0\;vnand %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vor -> vnand
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 4c70237d2d27..d4965b6df864 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -254,6 +254,7 @@ sub gen_logical_addsubf
   "vnand_vnand" => 241,
   "vorc_vnand"  => 244,
   "veqv_vnand"  => 246,
+  "vnor_vnand"  => 247,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:426e6a7c25056fcd5303ce1de6f3beb36b41194a

commit 426e6a7c25056fcd5303ce1de6f3beb36b41194a
Author: Michael Meissner 
Date:   Tue Jun 10 18:00:38 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector orc => nand fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md| 15 +--
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index 241b8a494fb1..96d8951049c9 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -2447,20 +2447,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vorc -> vnand
 (define_insn "*fuse_vorc_vnand"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(ior:VM (not:VM (ior:VM (not:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v"))
-  (match_operand:VM 1 "altivec_register_operand" "v,v,v,v")))
- (not:VM (match_operand:VM 2 "altivec_register_operand" "v,v,v,v"))))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(ior:VM (not:VM (ior:VM (not:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v"))
+  (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v")))
+ (not:VM (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v"))))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vorc %3,%1,%0\;vnand %3,%3,%2
vorc %3,%1,%0\;vnand %3,%3,%2
vorc %3,%1,%0\;vnand %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,244
vorc %4,%1,%0\;vnand %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vxor -> vnand
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 728a447c65a9..77d3e999eb93 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -252,6 +252,7 @@ sub gen_logical_addsubf
   "vnand_vxor"  => 225,
   "vnand_vor"   => 239,
   "vnand_vnand" => 241,
+  "vorc_vnand"  => 244,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc(refs/users/meissner/heads/work210-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:05eece5a4cb62268780e5568a25dc6784f9b8c47

commit 05eece5a4cb62268780e5568a25dc6784f9b8c47
Author: Michael Meissner 
Date:   Tue Jun 10 17:15:11 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-06-10  Michael Meissner  

gcc/

PR target/117251
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf): Add support to
generate vector or => nor fusion if XXEVAL is supported.

Diff:
---
 gcc/config/rs6000/fusion.md| 15 +--
 gcc/config/rs6000/genfusion.pl |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index c2a2ebf4bfaf..c55e9d4abd67 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -2576,20 +2576,23 @@
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vor -> vnor
 (define_insn "*fuse_vor_vnor"
-  [(set (match_operand:VM 3 "altivec_register_operand" "=&0,&1,&v,v")
-(and:VM (not:VM (ior:VM (match_operand:VM 0 "altivec_register_operand" "v,v,v,v")
-  (match_operand:VM 1 "altivec_register_operand" "v,v,v,v")))
- (not:VM (match_operand:VM 2 "altivec_register_operand" "v,v,v,v"))))
-   (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
+  [(set (match_operand:VM 3 "vector_fusion_operand" "=&0,&1,&v,wa,v")
+(and:VM (not:VM (ior:VM (match_operand:VM 0 "vector_fusion_operand" "v,v,v,wa,v")
+  (match_operand:VM 1 "vector_fusion_operand" "v,v,v,wa,v")))
+ (not:VM (match_operand:VM 2 "vector_fusion_operand" "v,v,v,wa,v"))))
+   (clobber (match_scratch:VM 4 "=X,X,X,X,&v"))]
   "(TARGET_P10_FUSION)"
   "@
vor %3,%1,%0\;vnor %3,%3,%2
vor %3,%1,%0\;vnor %3,%3,%2
vor %3,%1,%0\;vnor %3,%3,%2
+   xxeval %x3,%x2,%x1,%x0,128
vor %4,%1,%0\;vnor %3,%4,%2"
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "prefixed" "*,*,*,yes,*")
+   (set_attr "isa" "*,*,*,xxeval,*")])
 
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; vector vorc -> vnor
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 9df4c8d6527e..58f900640bef 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -238,6 +238,7 @@ sub gen_logical_addsubf
   "vnor_vnor"   => 112,
   "vor_vxor"=> 120,
   "vor_vor" => 127,
+  "vor_vnor"=> 128,
 );
 
 KIND: foreach $kind ('scalar','vector') {


[gcc r16-1408] internal-fn: Fix up .POPCOUNT expansion

2025-06-10 Thread Jakub Jelinek via Gcc-cvs
https://gcc.gnu.org/g:9e9c8aaab10ffeeb58c4936b55e8126ad5e31307

commit r16-1408-g9e9c8aaab10ffeeb58c4936b55e8126ad5e31307
Author: Jakub Jelinek 
Date:   Wed Jun 11 07:00:27 2025 +0200

internal-fn: Fix up .POPCOUNT expansion

Apparently my ranger-during-expansion patch broke bootstrap on
aarch64-linux: while building libsupc++, there is endless recursion
on __builtin_popcountl (x) == 1 expansion.
The hack to temporarily replace SSA_NAME_VAR of the lhs which replaced
the earlier hack to temporarily change the gimple_call_lhs relies on
the lhs being expanded with EXPAND_WRITE when expanding that ifn call.
Unfortunately, in two spots I was using expand_normal (lhs) instead
of expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE) which was used
everywhere else in internal-fn.cc.  This happened to work fine in the
past, but doesn't anymore.  git blame shows it was my patch using
these incorrect calls.
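
For reference, the source-level shape that reaches this expansion path can
be sketched as below.  This is only an illustration of the comparison named
above and of an equivalent test that needs no population count.  It does not
show what expand_POPCOUNT emits, and the function names are hypothetical.

```c
#include <stdbool.h>

/* The form named in the commit message: exactly one bit set.  */
static bool
single_bit_popcount (unsigned long x)
{
  return __builtin_popcountl (x) == 1;
}

/* An equivalent test without a population count: x is non-zero and
   clearing its lowest set bit leaves nothing behind.  */
static bool
single_bit_bitops (unsigned long x)
{
  return x != 0 && (x & (x - 1)) == 0;
}
```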

2025-06-11  Jakub Jelinek  

* internal-fn.cc (expand_POPCOUNT): Use
expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE) instead of
expand_normal (lhs).

Diff:
---
 gcc/internal-fn.cc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index a0a73fefb906..3f4ac937367d 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -5561,7 +5561,7 @@ expand_POPCOUNT (internal_fn fn, gcall *stmt)
   expand_unary_optab_fn (fn, stmt, popcount_optab);
   rtx_insn *popcount_insns = end_sequence ();
   start_sequence ();
-  rtx plhs = expand_normal (lhs);
+  rtx plhs = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
   rtx pcmp = emit_store_flag (NULL_RTX, EQ, plhs, const1_rtx, lhsmode, 0, 0);
   if (pcmp == NULL_RTX)
 {
@@ -5603,7 +5603,7 @@ expand_POPCOUNT (internal_fn fn, gcall *stmt)
 {
   start_sequence ();
   emit_insn (cmp_insns);
-  plhs = expand_normal (lhs);
+  plhs = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
   if (GET_MODE (cmp) != GET_MODE (plhs))
cmp = convert_to_mode (GET_MODE (plhs), cmp, 1);
   /* For `<= 1`, we need to produce `2 - cmp` or `cmp ? 1 : 2` as that


[gcc r16-1410] testsuite: Add -mpopcnt and -mabm variants of PR90693 tests

2025-06-10 Thread Jakub Jelinek via Gcc-cvs
https://gcc.gnu.org/g:e477e7cd104af96c55379f69125db3f1c350c9ed

commit r16-1410-ge477e7cd104af96c55379f69125db3f1c350c9ed
Author: Jakub Jelinek 
Date:   Wed Jun 11 07:16:06 2025 +0200

testsuite: Add -mpopcnt and -mabm variants of PR90693 tests

My r16-1398 patch broke bootstrap on aarch64-linux and powerpc64le-linux
at least.  Fixed with r16-1408.
The following patch just adds testcases with which the bug can be reproduced
also on x86_64-linux where it hasn't been caught by the testsuite (while
there are 2 tests with it, both were compiled with -mno-abm -mno-popcnt
and so didn't trigger the right path).  This patch just includes those
tests in 4 further ones, two with -mpopcnt and two with -mabm flags.

2025-06-11  Jakub Jelinek  

PR tree-optimization/90693
* gcc.target/i386/pr90693-3.c: New test.
* gcc.target/i386/pr90693-4.c: New test.
* gcc.target/i386/pr90693-5.c: New test.
* gcc.target/i386/pr90693-6.c: New test.

Diff:
---
 gcc/testsuite/gcc.target/i386/pr90693-3.c | 5 +
 gcc/testsuite/gcc.target/i386/pr90693-4.c | 5 +
 gcc/testsuite/gcc.target/i386/pr90693-5.c | 5 +
 gcc/testsuite/gcc.target/i386/pr90693-6.c | 5 +
 4 files changed, 20 insertions(+)

diff --git a/gcc/testsuite/gcc.target/i386/pr90693-3.c b/gcc/testsuite/gcc.target/i386/pr90693-3.c
new file mode 100644
index ..601c83c1d586
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr90693-3.c
@@ -0,0 +1,5 @@
+/* PR tree-optimization/90693 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mpopcnt" } */
+
+#include "pr90693.c"
diff --git a/gcc/testsuite/gcc.target/i386/pr90693-4.c b/gcc/testsuite/gcc.target/i386/pr90693-4.c
new file mode 100644
index ..b149159d3b97
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr90693-4.c
@@ -0,0 +1,5 @@
+/* PR tree-optimization/90693 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mpopcnt" } */
+
+#include "pr90693-2.c"
diff --git a/gcc/testsuite/gcc.target/i386/pr90693-5.c b/gcc/testsuite/gcc.target/i386/pr90693-5.c
new file mode 100644
index ..0a6a637a44b6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr90693-5.c
@@ -0,0 +1,5 @@
+/* PR tree-optimization/90693 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mabm" } */
+
+#include "pr90693.c"
diff --git a/gcc/testsuite/gcc.target/i386/pr90693-6.c b/gcc/testsuite/gcc.target/i386/pr90693-6.c
new file mode 100644
index ..4040b5226501
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr90693-6.c
@@ -0,0 +1,5 @@
+/* PR tree-optimization/90693 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mabm" } */
+
+#include "pr90693-2.c"


[gcc r16-1409] ranger: Handle the theoretical case of GIMPLE_COND with one succ edge during expansion [PR120434]

2025-06-10 Thread Jakub Jelinek via Gcc-cvs
https://gcc.gnu.org/g:f3dde39e597f48832208f423fb20f29674ce49ae

commit r16-1409-gf3dde39e597f48832208f423fb20f29674ce49ae
Author: Jakub Jelinek 
Date:   Wed Jun 11 07:03:04 2025 +0200

ranger: Handle the theoretical case of GIMPLE_COND with one succ edge during expansion [PR120434]

On Tue, Jun 10, 2025 at 10:51:25AM -0400, Andrew MacLeod wrote:
> Edge range should be fine, and really that assert doesn't really need to be
> there.
>
> Where the issue could arise is in gimple-range-fold.cc in
> fold_using_range::range_of_range_op()  where we see something like:
>
>  else if (is_a<gcond *> (s) && gimple_bb (s))
>     {
>   basic_block bb = gimple_bb (s);
>   edge e0 = EDGE_SUCC (bb, 0);
>   edge e1 = EDGE_SUCC (bb, 1);
>
>   if (!single_pred_p (e0->dest))
>     e0 = NULL;
>   if (!single_pred_p (e1->dest))
>     e1 = NULL;
>   src.register_outgoing_edges (as_a<gcond *> (s),
>    as_a <irange> (r), e0, e1);
>
> Although, now that I look at it, it doesn't need much adjustment, just the
> expectation that there are 2 edges.  I suppose EDGE_SUCC (bb, 1) could
> potentially trap if there is only one edge.  We'd just have to guard it and
> allow for that case.

This patch implements that.
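
The guard can be illustrated outside of GCC with a toy model.  The types
and the function below are hypothetical stand-ins, not GCC's edge and
basic_block, and the sketch only mirrors the NULL handling for a block that
has a single successor edge.

```c
#include <stddef.h>

/* Hypothetical stand-ins for GCC's edge and basic_block.  */
struct toy_edge
{
  int dest_pred_count;		/* Predecessor count of the destination.  */
};

struct toy_bb
{
  size_t n_succs;		/* 1 or 2 in this model.  */
  struct toy_edge *succs[2];
};

/* Pick the outgoing edges worth registering.  An edge is kept only if
   its destination has a single predecessor.  The second edge is only
   looked at when it exists.  */
static void
pick_outgoing_edges (const struct toy_bb *bb,
		     struct toy_edge **e0, struct toy_edge **e1)
{
  *e0 = bb->succs[0];
  *e1 = bb->n_succs == 1 ? NULL : bb->succs[1];

  if ((*e0)->dest_pred_count != 1)
    *e0 = NULL;
  if (*e1 && (*e1)->dest_pred_count != 1)
    *e1 = NULL;
}
```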

2025-06-11  Jakub Jelinek  

PR middle-end/120434
* gimple-range-fold.cc: Include rtl.h.
(fold_using_range::range_of_range_op): Handle bb ending with
GIMPLE_COND during RTL expansion where there is only one succ
edge instead of two.

Diff:
---
 gcc/gimple-range-fold.cc | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/gcc/gimple-range-fold.cc b/gcc/gimple-range-fold.cc
index aed5c7dc21eb..d18b37b33800 100644
--- a/gcc/gimple-range-fold.cc
+++ b/gcc/gimple-range-fold.cc
@@ -51,6 +51,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "sreal.h"
 #include "ipa-cp.h"
 #include "ipa-prop.h"
+#include "rtl.h"
 // Construct a fur_source, and set the m_query field.
 
 fur_source::fur_source (range_query *q)
@@ -778,11 +779,14 @@ fold_using_range::range_of_range_op (vrange &r,
{
  basic_block bb = gimple_bb (s);
  edge e0 = EDGE_SUCC (bb, 0);
- edge e1 = EDGE_SUCC (bb, 1);
+ /* During RTL expansion one of the edges can be removed
+if expansion proves the jump is unconditional.  */
+ edge e1 = single_succ_p (bb) ? NULL : EDGE_SUCC (bb, 1);
 
+ gcc_checking_assert (e1 || currently_expanding_to_rtl);
  if (!single_pred_p (e0->dest))
e0 = NULL;
- if (!single_pred_p (e1->dest))
+ if (e1 && !single_pred_p (e1->dest))
e1 = NULL;
   src.register_outgoing_edges (as_a<gcond *> (s),
    as_a <irange> (r), e0, e1);


[gcc(refs/vendors/riscv/heads/gcc-15-with-riscv-opts)] RISC-V: Add test for vec_duplicate + vremu.vv combine case 0 with GR2VR cost 0, 2 and 15

2025-06-10 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:62503999c8624c0878cd1a955ecb29680d524d12

commit 62503999c8624c0878cd1a955ecb29680d524d12
Author: Pan Li 
Date:   Mon Jun 9 16:33:52 2025 +0800

RISC-V: Add test for vec_duplicate + vremu.vv combine case 0 with GR2VR cost 0, 2 and 15

Add asm dump check tests for the vec_duplicate + vremu.vv combine to
vremu.vx, with GR2VR cost 0, 2 and 15.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c: Add asm check
for vremu.vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary_data.h: Add test
data for run test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vrem-run-1-u16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vrem-run-1-u32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vrem-run-1-u64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vrem-run-1-u8.c: New test.

Signed-off-by: Pan Li 
(cherry picked from commit 0bdea31036e8268edd1b4ea3ed07478c07c96ad1)

Diff:
---
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c  |   2 +
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u32.c  |   2 +
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u64.c  |   2 +
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u8.c   |   2 +
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u16.c  |   2 +
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u32.c  |   2 +
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u64.c  |   2 +
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u8.c   |   2 +
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u16.c  |   2 +
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u32.c  |   2 +
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u64.c  |   2 +
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u8.c   |   2 +
 .../riscv/rvv/autovec/vx_vf/vx_binary_data.h   | 196 +
 .../riscv/rvv/autovec/vx_vf/vx_vrem-run-1-u16.c|  15 ++
 .../riscv/rvv/autovec/vx_vf/vx_vrem-run-1-u32.c|  15 ++
 .../riscv/rvv/autovec/vx_vf/vx_vrem-run-1-u64.c|  15 ++
 .../riscv/rvv/autovec/vx_vf/vx_vrem-run-1-u8.c |  15 ++
 17 files changed, 280 insertions(+)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c
index 92fbf227d563..474fed2be15d 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c
@@ -12,6 +12,7 @@ DEF_VX_BINARY_CASE_0_WRAP(T, &, and)
 DEF_VX_BINARY_CASE_0_WRAP(T, |, or)
 DEF_VX_BINARY_CASE_0_WRAP(T, ^, xor)
 DEF_VX_BINARY_CASE_0_WRAP(T, /, div)
+DEF_VX_BINARY_CASE_0_WRAP(T, %, rem)
 
 /* { dg-final { scan-assembler-times {vadd.vx} 1 } } */
 /* { dg-final { scan-assembler-times {vsub.vx} 1 } } */
@@ -20,3 +21,4 @@ DEF_VX_BINARY_CASE_0_WRAP(T, /, div)
 /* { dg-final { scan-assembler-times {vor.vx} 1 } } */
 /* { dg-final { scan-assembler-times {vxor.vx} 1 } } */
 /* { dg-final { scan-assembler-times {vdivu.vx} 1 } } */
+/* { dg-final { scan-assembler-times {vremu.vx} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u32.c
index f487b42820ee..28c0524c9934 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u32.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u32.c
@@ -12,6 +12,7 @@ DEF_VX_BINARY_CASE_0_WRAP(T, &, and)
 DEF_VX_BINARY_CASE_0_WRAP(T, |, or)
 DEF_VX_BINARY_CASE_0_WRAP(T, ^, xor)
 DEF_VX_BINARY_CASE_0_WRAP(T, /, div)
+DEF_VX_BINARY_CASE_0_WRAP(T, %, rem)
 
 /* { dg-final { scan-assembler-times {vadd.vx} 1 } } */
 /* { dg-final { scan-assembler-times {vsub.vx} 1 } } */
@@ -20,3 +21,4 @@ DEF_VX_BINARY_CASE_0_WRAP(T, /, div)
 /* { dg-final { scan-assembler-times {vor.vx} 1 } } */
 /* { dg-final { scan-assembler-times {vxor.vx} 1 } } */
 /* { dg-final { scan-assembler-times {vdivu.vx} 1 } } */
+/* { dg-final { scan-assembler-times {vremu.vx} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u64.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u64.c
index 761d25c0833a..62c1ee996fd9 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u64.c
+++ b/gcc/testsuite/gcc.target

[gcc(refs/vendors/riscv/heads/gcc-15-with-riscv-opts)] [RISC-V] Fix ICE due to splitter emitting constant loads directly

2025-06-10 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:4c63893d1c1a92052016cfdb18854f9712bdf949

commit 4c63893d1c1a92052016cfdb18854f9712bdf949
Author: Jeff Law 
Date:   Tue Jun 10 06:38:52 2025 -0600

[RISC-V] Fix ICE due to splitter emitting constant loads directly

This is a fix for a bug found internally in Ventana using the cf3 testsuite.

cf3 looks to be dead as a project and likely subsumed by modern fuzzers.  In
fact internally we tripped another issue with cf3 that had already been
reported by Edwin with the fuzzer he runs.

Anyway, the splitter in question blindly emits the 2nd adjusted constant into a
register.  That's not valid if the constant requires any kind of synthesis --
and it well could, since we're mostly focused on the first constant turning into
something that can be loaded via LUI without increasing the cost of the second
constant.

Instead of using the split RTL template, this just emits the code we want
directly, using riscv_emit_move to synthesize the constant into the provided
temporary register.
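
The arithmetic behind the split sequence can be checked with a small
sketch.  It is not part of the patch.  The precondition used below, that
every bit set in -C2 is also set in C1, is an assumption about the insn
condition, which is only partly visible in the diff.  The function names
are hypothetical, and unsigned 64-bit arithmetic stands in for the X mode.

```c
#include <stdint.h>

/* (X & C1) + C2 computed directly.  */
static uint64_t
and_plus_direct (uint64_t x, uint64_t c1, uint64_t c2)
{
  return (x & c1) + c2;
}

/* The same value computed the way the split sequence does it:
   tmp = ~C1; dest = X | tmp; tmp = ~C1 | -C2; dest = dest - tmp.  */
static uint64_t
and_plus_split (uint64_t x, uint64_t c1, uint64_t c2)
{
  uint64_t tmp = ~c1;		/* operands[5], the LUI-loadable constant.  */
  uint64_t dest = x | tmp;
  tmp = ~c1 | (0 - c2);		/* operands[6], may need synthesis.  */
  return dest - tmp;
}

/* Assumed precondition: the bits of -C2 are a subset of the bits of C1,
   so ~C1 and -C2 do not overlap.  */
static int
precondition_holds (uint64_t c1, uint64_t c2)
{
  return (c1 & (0 - c2)) == (0 - c2);
}
```

When the precondition holds, X | ~C1 equals (X & C1) + ~C1 and
~C1 | -C2 equals ~C1 - C2, so the subtraction leaves (X & C1) + C2.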

Tested in my system.  Waiting on upstream CI's verdict before moving forward.

gcc/
* config/riscv/riscv.md (lui-constraintand_to_or): Do not use
the RTL template for split code.  Emit it directly taking care to avoid
emitting a constant load that needed synthesis.  Fix formatting.

gcc/testsuite/
* gcc.target/riscv/ventana-16122.c: New test.

(cherry picked from commit b93d8873cda88f0892c7782b274904fa8d3751fb)

Diff:
---
 gcc/config/riscv/riscv.md  | 18 +-
 gcc/testsuite/gcc.target/riscv/ventana-16122.c | 19 +++
 2 files changed, 32 insertions(+), 5 deletions(-)

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 6d3c80a04c74..3aed25c25880 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -884,7 +884,7 @@
 ;; Where C1 is not a LUI operand, but ~C1 is a LUI operand
 
 (define_insn_and_split "*lui_constraint_and_to_or"
-   [(set (match_operand:X 0 "register_operand" "=r")
+  [(set (match_operand:X 0 "register_operand" "=r")
(plus:X (and:X (match_operand:X 1 "register_operand" "r")
   (match_operand 2 "const_int_operand"))
(match_operand 3 "const_int_operand")))
@@ -898,13 +898,21 @@
<= riscv_const_insns (operands[3], false)))"
   "#"
   "&& reload_completed"
-  [(set (match_dup 4) (match_dup 5))
-   (set (match_dup 0) (ior:X (match_dup 1) (match_dup 4)))
-   (set (match_dup 4) (match_dup 6))
-   (set (match_dup 0) (minus:X (match_dup 0) (match_dup 4)))]
+  [(const_int 0)]
   {
 operands[5] = GEN_INT (~INTVAL (operands[2]));
 operands[6] = GEN_INT ((~INTVAL (operands[2])) | (-INTVAL (operands[3])));
+
+/* This is always a LUI operand, so it's safe to just emit.  */
+emit_move_insn (operands[4], operands[5]);
+
+rtx x = gen_rtx_IOR (word_mode, operands[1], operands[4]);
+emit_move_insn (operands[0], x);
+
+/* This may require multiple steps to synthesize.  */
+riscv_emit_move (operands[4], operands[6]);
+x = gen_rtx_MINUS (word_mode, operands[0], operands[4]);
+emit_move_insn (operands[0], x);
   }
   [(set_attr "type" "arith")])
 
diff --git a/gcc/testsuite/gcc.target/riscv/ventana-16122.c b/gcc/testsuite/gcc.target/riscv/ventana-16122.c
new file mode 100644
index ..59e6467b57c0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/ventana-16122.c
@@ -0,0 +1,19 @@
+/* { dg-do compile { target { rv64 } } } */
+
+extern void NG (void);
+typedef signed char int8_t;
+typedef signed short int16_t;
+typedef signed int int32_t;
+void f74(void) {
+   int16_t x309 = 0x7fff;
+   volatile int32_t x310 = 0x7fff;
+   int8_t x311 = 59;
+   int16_t x312 = -0x8000;
+   static volatile int32_t t74 = 614992577;
+
+t74 = (x309==((x310^x311)%x312));
+
+if (t74 != 0) { NG(); } else { ; }
+   
+}
+


[gcc(refs/vendors/riscv/heads/gcc-15-with-riscv-opts)] [RISC-V] Enable more if-conversion on RISC-V

2025-06-10 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:65b255f2b11679630d29e4d77557e97caf7c2653

commit 65b255f2b11679630d29e4d77557e97caf7c2653
Author: Jeff Law 
Date:   Mon Jun 9 06:55:21 2025 -0600

[RISC-V] Enable more if-conversion on RISC-V

Another czero-related adjustment.  This time in costing of conditional move
sequences.  Essentially a copy from a promoted subreg can and should be ignored
from a costing standpoint.  We had some code to do this, but its conditions
were too strict.

No real surprises evaluating spec.  This should be a minor, but probably not
measurable, improvement in x264 and xz.  It is if-converting more in some
particularly hot routines, but not necessarily in the hot parts of those
routines.

It's been tested on riscv32-elf and riscv64-elf.  Versions of this have
bootstrapped and regression tested as well, though perhaps not this exact
version.

Waiting on pre-commit testing.

gcc/
* config/riscv/riscv.cc (riscv_noce_conversion_profitable_p): Relax
condition for adjustments due to copies from promoted SUBREGs.

(cherry picked from commit af3de9e20968c8fb0f5b950e4b0753a28a1d1dc3)

Diff:
---
 gcc/config/riscv/riscv.cc | 12 +---
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index f98072cca7ce..14ac2f3cdbc1 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -4609,16 +4609,14 @@ riscv_noce_conversion_profitable_p (rtx_insn *seq,
 
  rtx dest = SET_DEST (x);
 
- /* Do something similar for the  moves that are likely to
+ /* Do something similar for the moves that are likely to
 turn into NOP moves by the time the register allocator is
-done.  These are also side effects of how our sCC expanders
-work.  We'll want to check and update LAST_DEST here too.  */
- if (last_dest
- && REG_P (dest)
+done.  We don't require src to be something set in this
+sequence, just a promoted SUBREG.  */
+ if (REG_P (dest)
  && GET_MODE (dest) == SImode
  && SUBREG_P (src)
- && SUBREG_PROMOTED_VAR_P (src)
- && REGNO (SUBREG_REG (src)) == REGNO (last_dest))
+ && SUBREG_PROMOTED_VAR_P (src))
{
  riscv_if_info.original_cost += COSTS_N_INSNS (1);
  riscv_if_info.max_seq_cost += COSTS_N_INSNS (1);


[gcc(refs/vendors/riscv/heads/gcc-15-with-riscv-opts)] RISC-V: Reconcile the existing test for vremu.vx combine

2025-06-10 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:2e3b57db30a7b75eebc0e58470ada83c3e68c219

commit 2e3b57db30a7b75eebc0e58470ada83c3e68c219
Author: Pan Li 
Date:   Mon Jun 9 16:28:50 2025 +0800

RISC-V: Reconcile the existing test for vremu.vx combine

Some existing vrem-related tests need their asm checks adjusted
due to the cost model.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/vrem-rv32gcv.c: Adjust the
asm check for vremu.
* gcc.target/riscv/rvv/autovec/binop/vrem-rv64gcv.c: Ditto.

Signed-off-by: Pan Li 
(cherry picked from commit b59354cf309052de6a1c297f06411691c03bfd24)

Diff:
---
 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrem-rv32gcv.c | 4 ++--
 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrem-rv64gcv.c | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrem-rv32gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrem-rv32gcv.c
index ad918a9b800a..10de7c268e5e 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrem-rv32gcv.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrem-rv32gcv.c
@@ -4,8 +4,8 @@
 
 /* { dg-final { scan-assembler-times {\tvrem\.vv} 8 } } */
 /* { dg-final { scan-assembler-not {\tvrem\.vx} } } */
-/* { dg-final { scan-assembler-times {\tvremu\.vv} 5 } } */
-/* { dg-final { scan-assembler-times {\tvremu\.vx} 3 } } */
+/* { dg-final { scan-assembler-times {\tvremu\.vv} 8 } } */
+/* { dg-final { scan-assembler-not {\tvremu\.vx} } } */
 /* { dg-final { scan-tree-dump-times "\.COND_LEN_MOD" 16 "optimized" } } */
 /* { dg-final { scan-assembler-not {\tvmv1r\.v} } } */
 /* { dg-final { scan-assembler-not {\tvmv2r\.v} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrem-rv64gcv.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrem-rv64gcv.c
index 4e28f99e2886..cf187a2bde7c 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrem-rv64gcv.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrem-rv64gcv.c
@@ -5,8 +5,8 @@
 
 /* { dg-final { scan-assembler-times {\tvrem\.vv} 8 } } */
 /* { dg-final { scan-assembler-not {\tvrem\.vx} } } */
-/* { dg-final { scan-assembler-times {\tvremu\.vv} 4 } } */
-/* { dg-final { scan-assembler-times {\tvremu\.vx} 4 } } */
+/* { dg-final { scan-assembler-times {\tvremu\.vv} 8 } } */
+/* { dg-final { scan-assembler-not {\tvremu\.vx} } } */
 /* { dg-final { scan-tree-dump-times "\.COND_LEN_MOD" 16 "optimized" } } */
 /* { dg-final { scan-assembler-not {\tvmv1r\.v} } } */
 /* { dg-final { scan-assembler-not {\tvmv2r\.v} } } */


[gcc(refs/vendors/riscv/heads/gcc-15-with-riscv-opts)] RISC-V: Combine vec_duplicate + vremu.vv to vremu.vx on GR2VR cost

2025-06-10 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:1af3a0a0a9fbd5c74b1f623cdb50ace115ee3c97

commit 1af3a0a0a9fbd5c74b1f623cdb50ace115ee3c97
Author: Pan Li 
Date:   Mon Jun 9 16:24:34 2025 +0800

RISC-V: Combine vec_duplicate + vremu.vv to vremu.vx on GR2VR cost

This patch combines vec_duplicate + vremu.vv into vremu.vx, as in the
example code below.  The related pattern depends on the cost of
vec_duplicate from GR2VR.  Late-combine performs the combination if the
GR2VR cost is zero, and rejects it if the GR2VR cost is greater than zero.

Assume we have example code like below, with a GR2VR cost of 0.

  #define DEF_VX_BINARY(T, OP)\
  void\
  test_vx_binary (T * restrict out, T * restrict in, T x, unsigned n) \
  {   \
for (unsigned i = 0; i < n; i++)  \
  out[i] = in[i] OP x;\
  }

  DEF_VX_BINARY(int32_t, /)
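
For reference, a self-contained sketch of the unsigned-remainder instance that this combine targets. The extra NAME macro argument, the resulting function name and the use of __restrict (the GNU C++ spelling of restrict) are assumptions made so the snippet compiles on its own; they are not taken from the testsuite headers.

```cpp
#include <cstdint>

// Same shape as the DEF_VX_BINARY macro in the commit message, with a NAME
// argument added (an assumption) so the generated function has a valid name.
#define DEF_VX_BINARY(T, OP, NAME)                                      \
  void test_vx_binary_##NAME##_##T(T *__restrict out, T *__restrict in, \
                                   T x, unsigned n)                     \
  {                                                                     \
    for (unsigned i = 0; i < n; i++)                                    \
      out[i] = in[i] OP x;                                              \
  }

// Unsigned remainder by a loop-invariant scalar x.  When vectorized for RVV,
// x is broadcast (vec_duplicate) and the loop body is a vector remainder.
DEF_VX_BINARY(uint32_t, %, rem)
```

Whether the broadcast is folded into a .vx form depends on the GR2VR cost, as described above.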

Before this patch:
  test_vx_binary_or_int32_t_case_0:
    beq a3,zero,.L8
    vsetvli a5,zero,e32,m1,ta,ma
    vmv.v.x v2,a2
    slli a3,a3,32
    srli a3,a3,32
  .L3:
    vsetvli a5,a3,e32,m1,ta,ma
    vle32.v v1,0(a1)
    slli a4,a5,2
    sub a3,a3,a5
    add a1,a1,a4
    vremu.vv v1,v1,v2
    vse32.v v1,0(a0)
    add a0,a0,a4
    bne a3,zero,.L3

After this patch:
  test_vx_binary_or_int32_t_case_0:
    beq a3,zero,.L8
    slli a3,a3,32
    srli a3,a3,32
  .L3:
    vsetvli a5,a3,e32,m1,ta,ma
    vle32.v v1,0(a1)
    slli a4,a5,2
    sub a3,a3,a5
    add a1,a1,a4
    vremu.vx v1,v1,a2
    vse32.v v1,0(a0)
    add a0,a0,a4
    bne a3,zero,.L3

gcc/ChangeLog:

* config/riscv/riscv-v.cc (expand_vx_binary_vec_vec_dup): Add new
case UMOD.
* config/riscv/riscv.cc (riscv_rtx_costs): Ditto.
* config/riscv/vector-iterators.md: Add new op umod.

Signed-off-by: Pan Li 
(cherry picked from commit 85de2b8b58e1644f6d5f0f182426122416b19e6f)

Diff:
---
 gcc/config/riscv/riscv-v.cc  | 1 +
 gcc/config/riscv/riscv.cc| 1 +
 gcc/config/riscv/vector-iterators.md | 2 +-
 3 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index c31ec9e9b419..420baa587dc2 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -5570,6 +5570,7 @@ expand_vx_binary_vec_vec_dup (rtx op_0, rtx op_1, rtx 
op_2,
 case DIV:
 case UDIV:
 case MOD:
+case UMOD:
   icode = code_for_pred_scalar (code, mode);
   break;
 default:
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 14ac2f3cdbc1..d5ab128f05ff 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -3950,6 +3950,7 @@ riscv_rtx_costs (rtx x, machine_mode mode, int 
outer_code, int opno ATTRIBUTE_UN
case DIV:
case UDIV:
case MOD:
+   case UMOD:
  *total = get_vector_binary_rtx_cost (op, scalar2vr_cost);
  break;
default:
diff --git a/gcc/config/riscv/vector-iterators.md 
b/gcc/config/riscv/vector-iterators.md
index b1fd607320ef..42fc04c0ad38 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -4042,7 +4042,7 @@
 ])
 
 (define_code_iterator any_int_binop_no_shift_v_vdup [
-  plus minus and ior xor mult div udiv mod
+  plus minus and ior xor mult div udiv mod umod
 ])
 
 (define_code_iterator any_int_binop_no_shift_vdup_v [


[gcc(refs/vendors/riscv/heads/gcc-15-with-riscv-opts)] RISC-V: frm/mode-switch: remove dubious frm edge insertion before call_insn

2025-06-10 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:807bdaaea37a400347258a8be14f9c7e35378195

commit 807bdaaea37a400347258a8be14f9c7e35378195
Author: Vineet Gupta 
Date:   Sun Jun 8 14:54:37 2025 -0700

RISC-V: frm/mode-switch: remove dubious frm edge insertion before call_insn

This showed up when debugging the testcase for PR119164.

The RISC-V FRM mode-switching state machine has special handling for transitions
to and from a call_insn, as FRM needs to be saved/restored around calls
despite it not being a callee-saved reg; rather it's a "global" reg
which can be temporarily modified "locally" with a static RM.
Thus a call needs to see the prior global state, hence the restore (from a
prior backup) before the call. As a corollary, any call can potentially clobber
the FRM, thus post-call it needs to be re-read/saved.

The following example demonstrates this:
 - insns 2, 4, 6 correspond to actual user code,
 - the rest (1, 3, 5, 7) are frm save/restore insns generated by mode switch
   for the above-described ABI semantics.

test_float_point_frm_static:
1:  frrm    a5 <--
 2: fsrmi   2
3:  fsrm    a5 <--
 4: call    normalize_vl
5:  frrm    a5 <--
 6: fsrmi   3
7:  fsrm    a5 <--
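
As a rough source-level analogy only (not how GCC emits these insns), the same save/set/restore ordering can be written with the standard <cfenv> rounding-mode calls. The function names below are hypothetical, and the stub callee deliberately changes the rounding mode to show why the value has to be re-read after the call.

```cpp
#include <cfenv>

// Hypothetical callee: like any call, it is allowed to change the dynamic
// rounding mode that the caller sees afterwards.
inline void normalize_vl_stub()
{
  std::fesetround(FE_TOWARDZERO);
}

// Mirrors insns 1-7 of the listing above.
inline void frm_static_model()
{
  int saved = std::fegetround();  // 1: frrm a5   (back up the global mode)
  std::fesetround(FE_DOWNWARD);   // 2: fsrmi 2   (static RM for user code)
  std::fesetround(saved);         // 3: fsrm a5   (restore before the call)
  normalize_vl_stub();            // 4: call      (may clobber the mode)
  saved = std::fegetround();      // 5: frrm a5   (re-read after the call)
  std::fesetround(FE_UPWARD);     // 6: fsrmi 3   (static RM for user code)
  std::fesetround(saved);         // 7: fsrm a5   (restore what the call left)
}
```

After frm_static_model returns, the active rounding mode is whatever the callee left behind, not the mode that was active on entry, which is the ABI behavior described above.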

The current implementation of RISC-V TARGET_MODE_NEEDED has special handling
if the call_insn is the last insn of a BB, to ensure FRM save/reads are emitted
on all the edges. However it doesn't work as intended and is borderline
bogus for the following reasons:

 - It fails to detect call_insn as last of BB (PR119164 test) if the
   next BB starts with a code label (say due to call being conditional).
   Granted this is a deficiency of API next_nonnote_nondebug_insn_bb ()
   which incorrectly returns next BB code_label as opposed to returning
   NULL (and this behavior is kind of relied upon by much of gcc).
   This causes missed/delayed state transition to DYN.

 - If code is tightened to actually detect above such as:

 -  rtx_insn *insn = next_nonnote_nondebug_insn_bb (cur_insn);
 -  if (!insn)
 +  if (BB_END (BLOCK_FOR_INSN (cur_insn)) == cur_insn)

   edge insertion happens but ends up splitting the BB, which generic
   mode-sw doesn't expect, and it ends up hitting an ICE.

 - TARGET_MODE_NEEDED hooks typically don't modify the CFG.

 - For abnormal edges, insert_insn_end_basic_block () is called, which
   by design on encountering call_insn as last in BB, inserts new insn
   BEFORE the call, not after.

So this is just all wrong and ripe for removal. Moreover there seems to
be no testsuite coverage for this code path at all. Results don't change
at all if this is removed.

The total number of FRM reads/writes emitted (static count) across all
benchmarks of a SPEC2017 -Ofast -march=rv64gcv build decreases slightly,
so it's a net win even if minimal, but the real gain is reduced complexity
and maintenance.

                   Before Patch            After Patch
                ------------------      ------------------
                frrm  fsrmi   fsrm      frrm  fsrmi   fsrm
   perlbench_r    42      0      4        42      0      4
      cpugcc_r   167      0     17       167      0     17
      bwaves_r    16      0      1        16      0      1
         mcf_r    11      0      0        11      0      0
  cactusBSSN_r    79      0     27        76      0     27
        namd_r   119      0     63       119      0     63
      parest_r   218      0    114       168      0    114 <--
      povray_r   123      1     17       123      1     17
         lbm_r     6      0      0         6      0      0
     omnetpp_r    17      0      1        17      0      1
         wrf_r  2287     13   1956      2287     13   1956
    cpuxalan_r    17      0      1        17      0      1
      ldecod_r    11      0      0        11      0      0
        x264_r    14      0      1        14      0      1
     blender_r   724     12    182       724     12    182
        cam4_r   324     13    169       324     13    169
   deepsjeng_r    11      0      0        11      0      0
     imagick_r   265     16     34       265     16     34
       leela_r    12      0      0        12      0      0
         nab_r    13      0      1        13      0      1
   exchange2_r    16      0      1        16      0      1
   fotonik3d_r    20      0     11        20      0     11
        roms_r    33      0     23        33      0     23
          xz_r     6      0      0         6      0      0
                ------------------      ------------------
                4551     55   2623      4498     55   2623

gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_frm_emit_after_bb_end): Delete.
(riscv_frm_mode_needed): Remove call riscv_frm_emit_after_bb_end.

Signed-off-by: Vineet Gupta 
(cherry picked from commit 01b89455b09df72285a85e4fda1ff14fe4191d9e)

Diff:
---
 gcc/config/riscv/riscv.cc | 44 ---

[gcc(refs/vendors/riscv/heads/gcc-15-with-riscv-opts)] RISC-V: Add test for vec_duplicate + vdivu.vv combine case 1 with GR2VR cost 0, 1 and 2

2025-06-10 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:f47952a91134c4c78e0eeae2f7b455030eeb8ad8

commit f47952a91134c4c78e0eeae2f7b455030eeb8ad8
Author: Pan Li 
Date:   Fri Jun 6 09:51:10 2025 +0800

RISC-V: Add test for vec_duplicate + vdivu.vv combine case 1 with GR2VR 
cost 0, 1 and 2

Add asm dump check tests for the vec_duplicate + vdivu.vv combine to vdivu.vx,
with GR2VR costs of 0, 1 and 2.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c: Add asm check
for vdivu.vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u8.c: Ditto.

Signed-off-by: Pan Li 
(cherry picked from commit c01830fa809fa18d1d54b29a89cb65f3bb8f5676)

Diff:
---
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u32.c | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u64.c | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u8.c  | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u16.c | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u32.c | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u64.c | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u8.c  | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u16.c | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u32.c | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u64.c | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u8.c  | 2 ++
 12 files changed, 24 insertions(+)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c
index 4bc0850f6737..58e4a1e96d6c 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c
@@ -11,6 +11,7 @@ DEF_VX_BINARY_REVERSE_CASE_1_WRAP(T, -, rsub, 
VX_BINARY_REVERSE_BODY_X16)
 DEF_VX_BINARY_CASE_1_WRAP(T, &, and, VX_BINARY_BODY_X16)
 DEF_VX_BINARY_CASE_1_WRAP(T, |, or, VX_BINARY_BODY_X16)
 DEF_VX_BINARY_CASE_1_WRAP(T, ^, xor, VX_BINARY_BODY_X16)
+DEF_VX_BINARY_CASE_1_WRAP(T, /, div, VX_BINARY_BODY_X16)
 
 /* { dg-final { scan-assembler {vadd.vx} } } */
 /* { dg-final { scan-assembler {vsub.vx} } } */
@@ -18,3 +19,4 @@ DEF_VX_BINARY_CASE_1_WRAP(T, ^, xor, VX_BINARY_BODY_X16)
 /* { dg-final { scan-assembler {vand.vx} } } */
 /* { dg-final { scan-assembler {vor.vx} } } */
 /* { dg-final { scan-assembler {vxor.vx} } } */
+/* { dg-final { scan-assembler {vdivu.vx} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u32.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u32.c
index 255273d767f0..3d5f53568dbe 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u32.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u32.c
@@ -11,6 +11,7 @@ DEF_VX_BINARY_REVERSE_CASE_1_WRAP(T, -, rsub, 
VX_BINARY_REVERSE_BODY_X4)
 DEF_VX_BINARY_CASE_1_WRAP(T, &, and, VX_BINARY_BODY_X4)
 DEF_VX_BINARY_CASE_1_WRAP(T, |, or, VX_BINARY_BODY_X4)
 DEF_VX_BINARY_CASE_1_WRAP(T, ^, xor, VX_BINARY_BODY_X4)
+DEF_VX_BINARY_CASE_1_WRAP(T, /, div, VX_BINARY_BODY_X4)
 
 /* { dg-final { scan-assembler {vadd.vx} } } */
 /* { dg-final { scan-assembler {vsub.vx} } } */
@@ -18,3 +19,4 @@ DEF_VX_BINARY_CASE_1_WRAP(T, ^, xor, VX_BINARY_BODY_X4)
 /* { dg-final { scan-assembler {vand.vx} } } */
 /* { dg-final { scan-assembler {vor.vx} } } */
 /* { dg-final { scan-assembler {vxor.vx} } } */
+/* { dg-final { scan-assembler {vdivu.vx} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u64.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u64.c
index d21f61b49e73..0edb9257a7a7 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u64.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u64.c
@@ -11,6 +11,7 @@ DEF_VX_BINARY_REVERSE_CASE_1_WRAP(T, -, rsub, 
VX_BINARY_REVERSE_BODY)
 DEF_VX_BINARY_CASE_1_WRAP(T, &, and, VX_BINARY_BODY)
 DEF_VX_BINARY_CASE_1_WRAP(T, |, or, VX_BINARY_BODY)
 DEF_VX_BINARY_CASE_1_WRAP(T, ^, xor, VX_BINARY_BODY)
+DEF_VX_BINARY_CASE_1_WRAP(T, /, div, VX_BINARY_BODY)
 
 /* { dg-final { scan-assembler {vadd.vx} } } */
 /* { dg-final { scan-assembler {vsub.vx} } } */
@@ -18,3 +19,4 @@ DEF_VX_BINARY_CASE_1_WRAP

[gcc(refs/vendors/riscv/heads/gcc-15-with-riscv-opts)] [RISC-V] Improve signed division by 2^n

2025-06-10 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:5dd9e41c33464836a07c7cd93111a439993e26d8

commit 5dd9e41c33464836a07c7cd93111a439993e26d8
Author: Jeff Law 
Date:   Thu Jun 5 16:58:45 2025 -0600

[RISC-V] Improve signed division by 2^n

So another class of cases where we can do better than a zicond sequence.
Like the prior patch this came up evaluating some code from Shreya to detect
more conditional move cases.

This patch allows us to use the "splat the sign bit" idiom to efficiently
select between 0 and 2^n-1.  That's particularly important for signed
division by a power of two.

For signed division by a power of 2, you conditionally add 2^n-1 to the
numerator, then right shift that result.  Using zicond somewhat naively you
get something like this (for n / 4096):

> li      a5,4096
> addi    a5,a5,-1
> slti    a4,a0,0
> add     a5,a0,a5
> czero.eqz   a5,a5,a4
> czero.nez   a0,a0,a4
> add     a0,a0,a5
> srai    a0,a0,12

After this patch you get this instead:

> srai    a5,a0,63
> srli    a5,a5,52
> add     a0,a5,a0
> srai    a0,a0,12

It's not *that* much faster, but it's certainly shorter.

So the trick here is that after splatting the sign bit we have 0, -1. So a
subsequent logical shift right would generate 0 or 2^n-1.
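
The four-instruction sequence can be sketched at the source level as follows. This is only an illustration of the idiom for 64-bit values; the function name is hypothetical, and it assumes (as GCC guarantees, and C++20 requires) that right-shifting a negative signed value is an arithmetic shift and that unsigned-to-signed conversion wraps.

```cpp
#include <cstdint>

// Hypothetical helper: truncating signed division of x by 2^n, for
// 1 <= n <= 63, without a branch or conditional move.
inline int64_t div_pow2(int64_t x, unsigned n)
{
  // srai x,63: splat the sign bit, giving 0 or all ones (-1).
  uint64_t sign = static_cast<uint64_t>(x >> 63);
  // srli by 64-n: logical shift turns 0 / -1 into 0 / 2^n-1.
  uint64_t bias = sign >> (64 - n);
  // add, then srai by n: bias negative numerators so the shift rounds
  // toward zero like C division does.
  return static_cast<int64_t>(static_cast<uint64_t>(x) + bias) >> n;
}
```

For n = 12 the second shift amount is 52, matching the srli in the sequence above.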

Yes, there is a nice variety of other constant pairs we can select between.
Some notes have been added to the PR I opened yesterday.

The first thing we need to do is throttle back zicond generation.
Unfortunately we don't see the constants from the division-by-2^n algorithm,
so we have to disable it for all lt/ge 0 cases.  This can have small negative
impacts.  I looked at this across spec and didn't see anything I was
particularly worried about and numerous small improvements from that alone.

With that in place we need to recognize the form seen by combine.
Essentially it sees the splat of the sign bit feeding a logical AND.  We split
that into two right shifts.

This has survived in my tester.  Waiting on upstream pre-commit before moving
forward.

gcc/
* config/riscv/riscv.cc (riscv_expand_conditional_move): Avoid
zicond in some cases involving sign bit tests.
* config/riscv/riscv.md: Split a splat of the sign bit feeding a
masking off high bits into a pair of right shifts.

gcc/testsuite
* gcc.target/riscv/nozicond-3.c: New test.

(cherry picked from commit 409ea888f73b2d4ae17686b28d33ca4634dafcfb)

Diff:
---
 gcc/config/riscv/riscv.cc   | 34 +
 gcc/config/riscv/riscv.md   | 18 +++
 gcc/testsuite/gcc.target/riscv/nozicond-3.c | 11 ++
 3 files changed, 63 insertions(+)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 3254ec9f9e13..413eae05f4c9 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -5393,6 +5393,40 @@ riscv_expand_conditional_move (rtx dest, rtx op, rtx 
cons, rtx alt)
   rtx op0 = XEXP (op, 0);
   rtx op1 = XEXP (op, 1);
 
+  /* For some tests, we can easily construct a 0, -1 value
+ which can then be used to synthesize more efficient
+ sequences that don't use zicond.  */
+  if ((code == LT || code == GE)
+  && (REG_P (op0) || SUBREG_P (op0))
+  && op1 == CONST0_RTX (GET_MODE (op0)))
+{
+  /* The code to expand signed division by a power of 2 uses a
+conditional add by 2^n-1 idiom.  It can be more efficiently
+synthesized without zicond using srai+srli+add.
+
+But we don't see the constants here.  Just a conditional move
+with registers as the true/false values.  So this is a little
+over-aggressive and can result in a few missed if-conversions.  */
+  if ((REG_P (cons) || SUBREG_P (cons))
+ && (REG_P (alt) || SUBREG_P (alt)))
+   return false;
+
+  /* If one value is a nonzero constant and the other value is
+not a constant, then avoid zicond as more efficient sequences
+using the splatted sign bit are often possible.  */
+  if (CONST_INT_P (alt)
+ && alt != CONST0_RTX (mode)
+ && !CONST_INT_P (cons))
+   return false;
+
+  if (CONST_INT_P (cons)
+ && cons != CONST0_RTX (mode)
+ && !CONST_INT_P (alt))
+   return false;
+
+  /* If we need more special cases, add them here.  */
+}
+
   if (((TARGET_ZICOND_LIKE
|| (arith_operand (cons, mode) && arith_operand (alt, mode)))
&& (GET_MODE_CLASS (mode) == MODE_INT))
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 92fe7c7741a2..6d3c80a04c74 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -4834,6 +4834,24 @@
   [(set_attr 

[gcc(refs/vendors/riscv/heads/gcc-15-with-riscv-opts)] RISC-V: frm/mode-switch: Reduce FRM restores on DYN transition [PR119164]

2025-06-10 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:e9f1bc45ff3500d57d3ad5d7372b5092f2c121fc

commit e9f1bc45ff3500d57d3ad5d7372b5092f2c121fc
Author: Vineet Gupta 
Date:   Sun Jun 8 14:55:01 2025 -0700

RISC-V: frm/mode-switch: Reduce FRM restores on DYN transition [PR119164]

The FRM mode-switching state machine has DYN as its default state, which it
also falls back to after transitioning to other states such as DYN_CALL.
Currently TARGET_MODE_EMIT generates an FRM restore on any transition to
DYN, leading to spurious/extraneous FRM restores.

Only do this if an interim static Rounding Mode was observed in the state
machine.
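
A simplified model of just the "transition to DYN" clause, to make the new condition concrete. The enum, the function name and the static_frm_seen parameter (standing in for STATIC_FRM_P (cfun)) are hypothetical; the real predicate in riscv_emit_frm_mode_set has a further clause that is not modeled here.

```cpp
// Hypothetical stand-ins for the FRM states named in the commit message.
enum class FrmMode { Dyn, DynCall, Static };

// Returns true when the DYN clause alone would request an FRM restore.
// Before the patch the static_frm_seen test was absent, so every
// transition to Dyn (except from DynCall) produced a restore.
constexpr bool restore_on_dyn(bool static_frm_seen, FrmMode mode, FrmMode prev)
{
  return static_frm_seen
         && mode == FrmMode::Dyn
         && prev != FrmMode::DynCall;
}
```

With no static rounding mode in the function, the clause is false regardless of the transition, which is what removes the extraneous restores.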

Fixes the extraneous FRM read/write in PR119164 (and also PR119832 w/o need
for TARGET_MODE_CONFLUENCE). Also reduces the number of FRM writes in
a SPEC2017 -Ofast -march=rv64gcv build significantly.

                      Before                  After
                ------------------      ------------------
                frrm  fsrmi   fsrm      frrm  fsrmi   fsrm
   perlbench_r    42      0      4        17      0      1
      cpugcc_r   167      0     17        11      0      0
      bwaves_r    16      0      1        16      0      1
         mcf_r    11      0      0        11      0      0
  cactusBSSN_r    76      0     27        19      0      1
        namd_r   119      0     63        14      0      1
      parest_r   168      0    114        24      0      1
      povray_r   123      1     17        26      1      6
         lbm_r     6      0      0         6      0      0
     omnetpp_r    17      0      1        17      0      1
         wrf_r  2287     13   1956      1268     13   1603
    cpuxalan_r    17      0      1        17      0      1
      ldecod_r    11      0      0        11      0      0
        x264_r    14      0      1        11      0      0
     blender_r   724     12    182        61     12     42
        cam4_r   324     13    169        45     13     20
   deepsjeng_r    11      0      0        11      0      0
     imagick_r   265     16     34       132     16     25
       leela_r    12      0      0        12      0      0
         nab_r    13      0      1        13      0      1
   exchange2_r    16      0      1        16      0      1
   fotonik3d_r    20      0     11        19      0      1
        roms_r    33      0     23        21      0      1
          xz_r     6      0      0         6      0      0
                ------------------      ------------------
                4498     55   2623      1804     55   1707
                ------------------      ------------------
                       7176                    3566
                ------------------      ------------------

PR target/119164

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_emit_frm_mode_set): check
STATIC_FRM_P for transition to DYN.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr119164.c: New test.

Signed-off-by: Vineet Gupta 
(cherry picked from commit 3c0f3b74bf6011b12fe12821ba6e1079309d9445)

Diff:
---
 gcc/config/riscv/riscv.cc  |  2 +-
 gcc/testsuite/gcc.target/riscv/rvv/base/pr119164.c | 22 ++
 2 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index a1bb51af2be4..1e56ee5dcb63 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -12277,7 +12277,7 @@ riscv_emit_frm_mode_set (int mode, int prev_mode)
  && prev_mode != riscv_vector::FRM_DYN
  && prev_mode != riscv_vector::FRM_DYN_CALL)
   /* Restore frm value when switch to DYN mode.  */
-  || (mode == riscv_vector::FRM_DYN
+  || (STATIC_FRM_P (cfun) && mode == riscv_vector::FRM_DYN
  && prev_mode != riscv_vector::FRM_DYN_CALL);
 
   if (restore_p)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr119164.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/pr119164.c
new file mode 100644
index ..a39a7f177f05
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr119164.c
@@ -0,0 +1,22 @@
+/* Reduced from SPEC2017 blender: node_texture_util.c.
+   The conditional function call was tripping mode switching state machine */
+
+/* { dg-do compile } */
+/* { dg-options " -Ofast -march=rv64gcv_zvl256b -ftree-vectorize 
-mrvv-vector-bits=zvl" } */
+
+void *a;
+float *b;
+short c;
+void d();
+void e() {
+  if (a)
+d();
+  if (c) {
+b[0] = b[0] * 0.5f + 0.5f;
+b[1] = b[1] * 0.5f + 0.5f;
+  }
+}
+
+/* { dg-final { scan-assembler-not {frrm\s+[axs][0-9]+} } } */
+/* { dg-final { scan-assembler-not {fsrmi\s+[01234]} } } */
+/* { dg-final { scan-assembler-not {fsrm\s+[axs][0-9]+} } } */


[gcc r16-1394] libstdc++: Implement LWG3528 make_from_tuple can perform (the equivalent of) a C-style cast

2025-06-10 Thread Jonathan Wakely via Gcc-cvs
https://gcc.gnu.org/g:73edc003c0a8f0badc7027e6deefd3a573300b03

commit r16-1394-g73edc003c0a8f0badc7027e6deefd3a573300b03
Author: Yihan Wang 
Date:   Mon Jun 9 11:07:51 2025 +0100

libstdc++: Implement LWG3528 make_from_tuple can perform (the equivalent 
of) a C-style cast

Implement LWG3528 to make std::make_from_tuple SFINAE friendly.
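
To illustrate the kind of check involved, here is a standalone C++17 sketch that mirrors the shape of the constraint using only public std names. The name can_make_from_tuple is hypothetical and this is not the libstdc++ internal; it only shows how is_constructible over the tuple's elements rejects conversions that a functional-style cast would have accepted.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

// Primary template: false unless the specialization below matches.
template <typename T, typename Tuple,
          typename Seq = std::make_index_sequence<
              std::tuple_size_v<std::remove_reference_t<Tuple>>>>
constexpr bool can_make_from_tuple = false;

// T must be constructible from the tuple's elements, each taken with the
// value category that std::get yields for the given tuple type.
template <typename T, typename Tuple, std::size_t... I>
constexpr bool can_make_from_tuple<T, Tuple, std::index_sequence<I...>>
    = std::is_constructible_v<
        T, decltype(std::get<I>(std::declval<Tuple>()))...>;
```

For example, int* is not constructible from void*, so can_make_from_tuple<int*, std::tuple<void*>> is false even though the expression (int*)p would compile as a cast.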

libstdc++-v3/ChangeLog:

* include/std/tuple (__can_make_from_tuple): New variable
template.
(__make_from_tuple_impl): Add static_assert.
(make_from_tuple): Constrain using __can_make_from_tuple.
* testsuite/20_util/tuple/dr3528.cc: New test.

Signed-off-by: Yihan Wang 
Co-authored-by: Jonathan Wakely 
Reviewed-by: Tomasz Kamiński 

Diff:
---
 libstdc++-v3/include/std/tuple | 24 --
 libstdc++-v3/testsuite/20_util/tuple/dr3528.cc | 46 ++
 2 files changed, 68 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/include/std/tuple b/libstdc++-v3/include/std/tuple
index 2e69af13a98b..b39ce710984c 100644
--- a/libstdc++-v3/include/std/tuple
+++ b/libstdc++-v3/include/std/tuple
@@ -2939,19 +2939,39 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #endif
 
 #ifdef __cpp_lib_make_from_tuple // C++ >= 17
+  template >>>
+constexpr bool __can_make_from_tuple = false;
+
+  // _GLIBCXX_RESOLVE_LIB_DEFECTS
+  // 3528. make_from_tuple can perform (the equivalent of) a C-style cast
+  template 
+constexpr bool __can_make_from_tuple<_Tp, _Tuple, index_sequence<_Idx...>>
+  = is_constructible_v<_Tp,
+  decltype(std::get<_Idx>(std::declval<_Tuple>()))...>;
+
   template 
 constexpr _Tp
 __make_from_tuple_impl(_Tuple&& __t, index_sequence<_Idx...>)
-{ return _Tp(std::get<_Idx>(std::forward<_Tuple>(__t))...); }
+{
+  static_assert(__can_make_from_tuple<_Tp, _Tuple, 
index_sequence<_Idx...>>);
+  return _Tp(std::get<_Idx>(std::forward<_Tuple>(__t))...);
+}
 
 #if __cpp_lib_tuple_like // >= C++23
   template 
 #else
   template 
 #endif
-constexpr _Tp
+constexpr auto
 make_from_tuple(_Tuple&& __t)
 noexcept(__unpack_std_tuple)
+#ifdef __cpp_concepts // >= C++20
+-> _Tp
+requires __can_make_from_tuple<_Tp, _Tuple>
+#else
+-> __enable_if_t<__can_make_from_tuple<_Tp, _Tuple>, _Tp>
+#endif
 {
   constexpr size_t __n = tuple_size_v>;
 #if __has_builtin(__reference_constructs_from_temporary)
diff --git a/libstdc++-v3/testsuite/20_util/tuple/dr3528.cc 
b/libstdc++-v3/testsuite/20_util/tuple/dr3528.cc
new file mode 100644
index ..c20ff95e12da
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/tuple/dr3528.cc
@@ -0,0 +1,46 @@
+// { dg-do compile { target c++17 } }
+
+// LWG 3528. make_from_tuple can perform (the equivalent of) a C-style cast
+
+#include 
+#include 
+#include 
+
+template
+using make_t = decltype(std::make_from_tuple(std::declval()));
+
+template
+constexpr bool can_make = false;
+template
+constexpr bool can_make>> = true;
+
+static_assert( can_make> );
+static_assert( can_make&> );
+static_assert( can_make&> );
+static_assert( can_make> );
+static_assert( can_make&&> );
+static_assert( can_make, std::pair> );
+static_assert( can_make, std::array> );
+static_assert( can_make> );
+static_assert( can_make> );
+static_assert( can_make> );
+static_assert( ! can_make> );
+static_assert( ! can_make> );
+static_assert( ! can_make&> );
+static_assert( ! can_make> );
+static_assert( ! can_make> );
+static_assert( ! can_make> );
+static_assert( ! can_make> );
+static_assert( ! can_make> );
+static_assert( ! can_make> );
+
+struct Two
+{
+  Two(const char*, int);
+};
+
+static_assert( can_make> );
+static_assert( ! can_make> );
+static_assert( can_make> );
+static_assert( ! can_make> );
+static_assert( ! can_make, std::array> );


[gcc r16-1397] Check if constant is a member before returning it.

2025-06-10 Thread Andrew Macleod via Gcc-cvs
https://gcc.gnu.org/g:6a4da727020b24b02b062f4bff718c9a5699629c

commit r16-1397-g6a4da727020b24b02b062f4bff718c9a5699629c
Author: Andrew MacLeod 
Date:   Tue Jun 10 12:11:18 2025 -0400

Check if constant is a member before returning it.

set_range_from_bitmask checks the new bitmask, and if it is a constant,
simply returns the constant.  It never checks if that constant is
actually within the range.  If it is not, the result should be UNDEFINED.
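
A toy model of the fixed behavior, assuming a single inclusive unsigned interval and a bitmask whose set mask bits mean "unknown". The types and the function name are hypothetical and far simpler than irange; std::nullopt stands in for UNDEFINED.

```cpp
#include <cstdint>
#include <optional>

struct ToyRange { uint64_t lo, hi; };          // inclusive bounds
struct ToyBitmask { uint64_t value, mask; };   // mask bit set = bit unknown

// Refine a range from its bitmask.  Only the "all bits known" case is
// modeled: the implied constant is returned only if the range contains it.
inline std::optional<ToyRange> refine_from_bitmask(ToyRange r, ToyBitmask b)
{
  if (b.mask != 0)
    return r;                                  // not a singleton; unchanged
  if (b.value >= r.lo && b.value <= r.hi)
    return ToyRange{b.value, b.value};         // singleton inside the range
  return std::nullopt;                         // contradiction: UNDEFINED
}
```

Before the fix, the equivalent of the last case would have returned the constant even though the range excluded it.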

* value-range.cc (irange::set_range_from_bitmask): When the bitmask
result is a singleton, check if it is contained in the range.

Diff:
---
 gcc/value-range.cc | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index ed3760fa6ff6..e2d75f59c2e0 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -2268,7 +2268,11 @@ irange::set_range_from_bitmask ()
   // If all the bits are known, this is a singleton.
   if (m_bitmask.mask () == 0)
 {
-  set (m_type, m_bitmask.value (), m_bitmask.value ());
+  // Make sure the singleton is within the range.
+  if (contains_p (m_bitmask.value ()))
+   set (m_type, m_bitmask.value (), m_bitmask.value ());
+  else
+   set_undefined ();
   return true;
 }


[gcc r16-1402] RISC-V: testsuite: fix an obvious build error

2025-06-10 Thread Vineet Gupta via Gcc-cvs
https://gcc.gnu.org/g:0005b1e577135bf0345447529d138f4d15618ec0

commit r16-1402-g0005b1e577135bf0345447529d138f4d15618ec0
Author: Vineet Gupta 
Date:   Tue May 20 14:15:53 2025 -0700

RISC-V: testsuite: fix an obvious build error

For a non-multilib build, I see the following errors.

| FAIL: gcc.target/riscv/rvv/vtype-call-clobbered.c (test for excess errors)
| Excess errors:
| TC-INSTxyz/sysroot/usr/include/gnu/stubs.h:14:11: fatal error: 
gnu/stubs-lp64.h: No such file or directory compilation terminated.

The test selects the non-default ABI lp64 (vs. lp64d) for no real reason.
Fix that.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vtype-call-clobbered.c: Fix -mabi.

Signed-off-by: Vineet Gupta 

Diff:
---
 gcc/testsuite/gcc.target/riscv/rvv/vtype-call-clobbered.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vtype-call-clobbered.c 
b/gcc/testsuite/gcc.target/riscv/rvv/vtype-call-clobbered.c
index be9f312aa508..78c8a4af8166 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/vtype-call-clobbered.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vtype-call-clobbered.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=rv64gcv -mabi=lp64 -O2" } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O2" } */
 /* { dg-skip-if "" { *-*-* } { "-O0" "-Og" "-Os" "-Oz" } } */
 
 #include "riscv_vector.h"


[gcc(refs/vendors/riscv/heads/gcc-15-with-riscv-opts)] [to-be-committed][RISC-V] Handle 32bit operands in condition for conditional moves

2025-06-10 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:38f54b2523adf37f545c92b5dac751a421047731

commit 38f54b2523adf37f545c92b5dac751a421047731
Author: Jeff Law 
Date:   Sat Jun 7 07:48:46 2025 -0600

[to-be-committed][RISC-V] Handle 32bit operands in condition for 
conditional moves

So here's the next chunk of conditional move work from Shreya.

It's been a long standing wart that the conditional move expander does not
support sub-word operands in the comparison.  Particularly since we have
support routines to handle the necessary extensions for that case.

This patch adjusts the expander to use riscv_extend_comparands rather than
fail for that case.  I've built spec2017 before/after this and we definitely
get more conditional moves and they look sensible from a performance
standpoint.  None are likely hitting terribly hot code, so I wouldn't expect
any performance jumps.
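
A hypothetical source-level example of the shape this affects on rv64: the condition compares 32-bit values while the selected values are word-sized. The function names are illustrative, and whether a given build actually emits a conditional move for it depends on the target options and cost model.

```cpp
#include <cstdint>

// The comparison operands are sub-word (32-bit) on a 64-bit target.
inline int64_t select_lt(int32_t a, int32_t b, int64_t x, int64_t y)
{
  return a < b ? x : y;
}

// What extending the comparands amounts to at the source level: widen both
// sides to word size (sign-extension, as the comparison is signed) and
// then compare.  The result is the same for all inputs.
inline int64_t select_lt_extended(int32_t a, int32_t b, int64_t x, int64_t y)
{
  return static_cast<int64_t>(a) < static_cast<int64_t>(b) ? x : y;
}
```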

Waiting on pre-commit testing to do its thing.

* config/riscv/riscv.cc (riscv_expand_conditional_move): Use
riscv_extend_comparands to extend sub-word comparison arguments.

Co-authored-by: Jeff Law  

(cherry picked from commit 59a3da733a79f621700dd9ddc11a0efc07237c3a)

Diff:
---
 gcc/config/riscv/riscv.cc | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 99eeba64b6f9..dd29059412b1 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -5436,13 +5436,18 @@ riscv_expand_conditional_move (rtx dest, rtx op, rtx 
cons, rtx alt)
   machine_mode mode0 = GET_MODE (op0);
   machine_mode mode1 = GET_MODE (op1);
 
-  /* An integer comparison must be comparing WORD_MODE objects.  We
-must enforce that so that we don't strip away a sign_extension
-thinking it is unnecessary.  We might consider using
-riscv_extend_operands if they are not already properly extended.  */
+  /* An integer comparison must be comparing WORD_MODE objects.
+Extend the comparison arguments as necessary.  */
   if ((INTEGRAL_MODE_P (mode0) && mode0 != word_mode)
  || (INTEGRAL_MODE_P (mode1) && mode1 != word_mode))
-   return false;
+   riscv_extend_comparands (code, &op0, &op1);
+
+  /* We might have been handed back a SUBREG.  Just to make things
+easy, force it into a REG.  */
+  if (!REG_P (op0) && !CONST_INT_P (op0))
+   op0 = force_reg (word_mode, op0);
+  if (!REG_P (op1) && !CONST_INT_P (op1))
+   op1 = force_reg (word_mode, op1);
 
   /* In the fallback generic case use MODE rather than WORD_MODE for
 the output of the SCC instruction, to match the mode of the NEG


[gcc r16-1395] libstdc++: Make __max_size_type and __max_diff_type structural

2025-06-10 Thread Patrick Palka via Gcc-cvs
https://gcc.gnu.org/g:1f402fe23b0d4cf024688a729f4c86c37144d54a

commit r16-1395-g1f402fe23b0d4cf024688a729f4c86c37144d54a
Author: Patrick Palka 
Date:   Tue Jun 10 10:15:25 2025 -0400

libstdc++: Make __max_size_type and __max_diff_type structural

This patch makes these integer-class types structural types by
public-izing their data members so that they could be used as NTTP
types.  I don't think this is required by the standard, but it seems
like a useful extension.

libstdc++-v3/ChangeLog:

* include/bits/max_size_type.h (__max_size_type::_M_val): Make
public instead of private.
(__max_size_type::_M_msb): Likewise.
(__max_diff_type::_M_rep): Likewise.
* testsuite/std/ranges/iota/max_size_type.cc: Verify
__max_diff_type and __max_size_type are structural.

Reviewed-by: Tomasz Kamiński 
Reviewed-by: Jonathan Wakely 

Diff:
---
 libstdc++-v3/include/bits/max_size_type.h   | 4 ++--
 libstdc++-v3/testsuite/std/ranges/iota/max_size_type.cc | 7 +++
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/include/bits/max_size_type.h b/libstdc++-v3/include/bits/max_size_type.h
index 5bec0b5a519a..e602b1b4bee5 100644
--- a/libstdc++-v3/include/bits/max_size_type.h
+++ b/libstdc++-v3/include/bits/max_size_type.h
@@ -425,10 +425,11 @@ namespace ranges
   using __rep = unsigned long long;
 #endif
   static constexpr size_t _S_rep_bits = sizeof(__rep) * __CHAR_BIT__;
-private:
+
   __rep _M_val = 0;
   unsigned _M_msb:1 = 0;
 
+private:
   constexpr explicit
   __max_size_type(__rep __val, int __msb) noexcept
: _M_val(__val), _M_msb(__msb)
@@ -752,7 +753,6 @@ namespace ranges
   { return !(__l < __r); }
 #endif
 
-private:
   __max_size_type _M_rep = 0;
 
   friend class __max_size_type;
diff --git a/libstdc++-v3/testsuite/std/ranges/iota/max_size_type.cc b/libstdc++-v3/testsuite/std/ranges/iota/max_size_type.cc
index 3e6f954ceb0c..4739d9e2f790 100644
--- a/libstdc++-v3/testsuite/std/ranges/iota/max_size_type.cc
+++ b/libstdc++-v3/testsuite/std/ranges/iota/max_size_type.cc
@@ -400,6 +400,13 @@ static_assert(max_diff_t(max_size_t(1)
 << (numeric_limits<max_size_t>::digits-1))
  == numeric_limits<max_diff_t>::min());
 
+// Verify that the types are structural types and can therefore be used
+// as NTTP types.
+template<max_size_t V> struct Su { static_assert(V*V == V+132); };
+template<max_diff_t V> struct Ss { static_assert(V*V == V+132); };
+template struct Su<12>;
+template struct Ss<12>;
+
 int
 main()
 {


[gcc r16-1368] ada: Fix Value_Decimal to raise Constraint_Error on boundary values

2025-06-10 Thread Marc Poulhies via Gcc-cvs
https://gcc.gnu.org/g:5db0a4c3c736a2164774344c3c1b4c3b34e59a75

commit r16-1368-g5db0a4c3c736a2164774344c3c1b4c3b34e59a75
Author: Eric Botcazou 
Date:   Tue Mar 18 22:44:15 2025 +0100

ada: Fix Value_Decimal to raise Constraint_Error on boundary values

Even though the issue is not user-visible, it's a (minor) departure from the
specification of the procedure.
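
The boundary check that the diff below adds to Integer_to_Decimal can be
sketched in C++ roughly as follows.  This is only an illustration under
stated assumptions: scale_with_extra is a hypothetical name, a 64-bit
unsigned type stands in for Uns, and an empty optional stands in for the
call to Bad_Value.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Multiply v by base and fold in a pending extra digit, reporting overflow
// instead of wrapping.  Assumes base > 0.  Mirrors the Ada test
// "V <= (Uns'Last - E) / B" followed by "V := V * B + E".
constexpr std::optional<std::uint64_t>
scale_with_extra(std::uint64_t v, std::uint64_t base, std::uint64_t extra)
{
  constexpr std::uint64_t last = std::numeric_limits<std::uint64_t>::max();
  if (v <= (last - extra) / base)
    return v * base + extra;   // cannot exceed last
  return std::nullopt;         // stand-in for Bad_Value (Str)
}

// v * 10 alone still fits, so a check that ignores the extra digit would
// accept both inputs; with the extra digit only the first one is in range.
static_assert(scale_with_extra(1844674407370955161ULL, 10, 5).has_value());
static_assert(!scale_with_extra(1844674407370955161ULL, 10, 6).has_value());
```

In the Ada code the extra digit is folded in only once (E is reset to 0
after its first use); the sketch models that single step.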

gcc/ada/ChangeLog:

* libgnat/s-valued.adb (Integer_to_Decimal): Add Extra parameter and
use its value to call Bad_Value on boundary values.
(Scan_Decimal): Adjust call to Integer_to_Decimal.
(Value_Decimal): Likewise.

Diff:
---
 gcc/ada/libgnat/s-valued.adb | 27 +++
 1 file changed, 19 insertions(+), 8 deletions(-)

diff --git a/gcc/ada/libgnat/s-valued.adb b/gcc/ada/libgnat/s-valued.adb
index dfef9a885e52..cc2cffc72a63 100644
--- a/gcc/ada/libgnat/s-valued.adb
+++ b/gcc/ada/libgnat/s-valued.adb
@@ -39,13 +39,15 @@ package body System.Value_D is
--  We need an unsigned type large enough to represent the mantissa
 
package Impl is new Value_R (Uns, 1, 2**(Int'Size - 1), Round => False);
-   --  We do not use the Extra digit for decimal fixed-point types
+   --  We do not use the Extra digit for decimal fixed-point types, except to
+   --  effectively ensure that overflow is detected near the boundaries.
 
function Integer_to_Decimal
  (Str: String;
   Val: Uns;
   Base   : Unsigned;
   ScaleB : Integer;
+  Extra  : Unsigned;
   Minus  : Boolean;
   Scale  : Integer) return Int;
--  Convert the real value from integer to decimal representation
@@ -59,6 +61,7 @@ package body System.Value_D is
   Val: Uns;
   Base   : Unsigned;
   ScaleB : Integer;
+  Extra  : Unsigned;
   Minus  : Boolean;
   Scale  : Integer) return Int
is
@@ -126,6 +129,10 @@ package body System.Value_D is
  end if;
   end Unsigned_To_Signed;
 
+  --  Local variables
+
+  E : Uns := Uns (Extra);
+
begin
   --  If the base of the value is 10 or its scaling factor is zero, then
   --  add the scales (they are defined in the opposite sense) and apply
@@ -143,9 +150,10 @@ package body System.Value_D is
 end loop;
 
 while S > 0 loop
-   if V <= Uns'Last / 10 then
-  V := V * 10;
+   if V <= (Uns'Last - E) / 10 then
+  V := V * 10 + E;
   S := S - 1;
+  E := 0;
else
   Bad_Value (Str);
end if;
@@ -193,8 +201,9 @@ package body System.Value_D is
   Z := 10 ** Integer'Max (0, -Scale);
 
   for J in 1 .. LS loop
- if V <= Uns'Last / Uns (B) then
-V := V * Uns (B);
+ if V <= (Uns'Last - E) / Uns (B) then
+V := V * Uns (B) + E;
+E := 0;
  else
 Bad_Value (Str);
  end if;
@@ -207,7 +216,7 @@ package body System.Value_D is
raise Program_Error;
 end if;
 
---  Perform a scale divide operation with rounding to match 'Image
+--  Perform a scaled divide operation with rounding to match 'Image
 
 Scaled_Divide (Unsigned_To_Signed (V), Y, Z, Q, R, Round => True);
 
@@ -238,7 +247,8 @@ package body System.Value_D is
begin
   Val := Impl.Scan_Raw_Real (Str, Ptr, Max, Base, Scl, Extra, Minus);
 
-  return Integer_to_Decimal (Str, Val (1), Base, Scl (1), Minus, Scale);
+  return
+Integer_to_Decimal (Str, Val (1), Base, Scl (1), Extra, Minus, Scale);
end Scan_Decimal;
 
---
@@ -255,7 +265,8 @@ package body System.Value_D is
begin
   Val := Impl.Value_Raw_Real (Str, Base, Scl, Extra, Minus);
 
-  return Integer_to_Decimal (Str, Val (1), Base, Scl (1), Minus, Scale);
+  return
+Integer_to_Decimal (Str, Val (1), Base, Scl (1), Extra, Minus, Scale);
end Value_Decimal;
 
 end System.Value_D;


[gcc(refs/vendors/riscv/heads/gcc-15-with-riscv-opts)] RISC-V: frm/mode-switch: remove TARGET_MODE_CONFLUENCE

2025-06-10 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:2b418bd916ce1d993c66a4658e4e7804d9256c45

commit 2b418bd916ce1d993c66a4658e4e7804d9256c45
Author: Vineet Gupta 
Date:   Sun Jun 8 14:44:29 2025 -0700

RISC-V: frm/mode-switch: remove TARGET_MODE_CONFLUENCE

This is effectively reverting e5d1f538bb7d
("RISC-V: Allow different dynamic floating point mode to be merged")
while retaining the testcase.

The change itself is valid; however, it obscures the deficiencies in the
current frm mode switching code.

Also, for a SPEC2017 -Ofast -march=rv64gcv build, it ends up generating
a net increase in FRM restores (writes) vs. the rest of this changeset.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_dynamic_frm_mode_p): Remove.
(riscv_mode_confluence): Ditto.
(TARGET_MODE_CONFLUENCE): Ditto.

Signed-off-by: Vineet Gupta 
(cherry picked from commit ac0fea67b9591197a3f21dd4fb924d87cc559e7e)

Diff:
---
 gcc/config/riscv/riscv.cc | 37 -
 1 file changed, 37 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index aa8cd97b3102..d032578f19a4 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -12485,41 +12485,6 @@ riscv_mode_needed (int entity, rtx_insn *insn, HARD_REG_SET)
 }
 }
 
-/* Return TRUE if the rouding mode is dynamic.  */
-
-static bool
-riscv_dynamic_frm_mode_p (int mode)
-{
-  return mode == riscv_vector::FRM_DYN
-|| mode == riscv_vector::FRM_DYN_CALL
-|| mode == riscv_vector::FRM_DYN_EXIT;
-}
-
-/* Implement TARGET_MODE_CONFLUENCE.  */
-
-static int
-riscv_mode_confluence (int entity, int mode1, int mode2)
-{
-  switch (entity)
-{
-case RISCV_VXRM:
-  return VXRM_MODE_NONE;
-case RISCV_FRM:
-  {
-   /* FRM_DYN, FRM_DYN_CALL and FRM_DYN_EXIT are all compatible.
-  Although we already try to set the mode needed to FRM_DYN after a
-  function call, there are still some corner cases where both FRM_DYN
-  and FRM_DYN_CALL may appear on incoming edges.  */
-   if (riscv_dynamic_frm_mode_p (mode1)
-   && riscv_dynamic_frm_mode_p (mode2))
- return riscv_vector::FRM_DYN;
-   return riscv_vector::FRM_NONE;
-  }
-default:
-  gcc_unreachable ();
-}
-}
-
 /* Return TRUE that an insn is asm.  */
 
 static bool
@@ -15123,8 +15088,6 @@ synthesize_and (rtx operands[3])
 #define TARGET_MODE_EMIT riscv_emit_mode_set
 #undef TARGET_MODE_NEEDED
 #define TARGET_MODE_NEEDED riscv_mode_needed
-#undef TARGET_MODE_CONFLUENCE
-#define TARGET_MODE_CONFLUENCE riscv_mode_confluence
 #undef TARGET_MODE_AFTER
 #define TARGET_MODE_AFTER riscv_mode_after
 #undef TARGET_MODE_ENTRY


[gcc r16-1396] cobol: Variety of small changes in answer to cppcheck diagnostics.

2025-06-10 Thread James K. Lowden via Gcc-cvs
https://gcc.gnu.org/g:70c3dd9a81cdefcaf24a66ec0c1ceddf5d3984dd

commit r16-1396-g70c3dd9a81cdefcaf24a66ec0c1ceddf5d3984dd
Author: James K. Lowden 
Date:   Tue Jun 10 10:34:28 2025 -0400

cobol: Variety of small changes in answer to cppcheck diagnostics.

Remove non-ASCII input and blank lines from gcobol.1. Restrict
cobol.clean target to compiler object files.

gcc/cobol/ChangeLog:

* Make-lang.in: cobol.clean does not remove libgcobol files.
* cdf.y: Suppress 1 cppcheck false positive.
* cdfval.h (scanner_parsing):  Partial via cppcheck for PR119324.
* gcobol.1: Fix groff errors.
* gcobolspec.cc (append_arg): Const parameter.
* parse_ante.h (intrinsic_call_2): Avoid NULL dereference.

Diff:
---
 gcc/cobol/Make-lang.in  |  3 +--
 gcc/cobol/cdf.y |  1 +
 gcc/cobol/cdfval.h  |  8 
 gcc/cobol/gcobol.1  | 12 +---
 gcc/cobol/gcobolspec.cc |  2 +-
 gcc/cobol/parse_ante.h  |  2 +-
 6 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/gcc/cobol/Make-lang.in b/gcc/cobol/Make-lang.in
index 993e4c6ffb02..5f293e1f9874 100644
--- a/gcc/cobol/Make-lang.in
+++ b/gcc/cobol/Make-lang.in
@@ -351,8 +351,7 @@ cobol.srcman:
 cobol.mostlyclean:
 
 cobol.clean:
-   rm -fr gcobol cobol1 cobol/*\
-   ../*/libgcobol/*
+   rm -fr gcobol cobol1 cobol/*
 
 cobol.distclean:
 
diff --git a/gcc/cobol/cdf.y b/gcc/cobol/cdf.y
index 0440d0216af9..e4d2feaaf52e 100644
--- a/gcc/cobol/cdf.y
+++ b/gcc/cobol/cdf.y
@@ -891,6 +891,7 @@ verify_integer( const YDFLTYPE& loc, const cdfval_base_t& val ) {
   return true;
 }
 
+// cppcheck-suppress returnTempReference
 const cdfval_base_t&
 cdfval_base_t::operator()( const YDFLTYPE& loc ) {
   static cdfval_t zero(0);
diff --git a/gcc/cobol/cdfval.h b/gcc/cobol/cdfval.h
index 09c21ab08e50..c4387b080661 100644
--- a/gcc/cobol/cdfval.h
+++ b/gcc/cobol/cdfval.h
@@ -38,6 +38,14 @@
 
 bool scanner_parsing();
 
+/* cdfval_base_t has no constructor because otherwise: 
+ * cobol/cdf.h:172:7: note: ‘YDFSTYPE::YDFSTYPE()’ is implicitly deleted 
+ *  because the default definition would be ill-formed:
+ * 172 | union YDFSTYPE
+ * 
+ * We use the derived type cdfval_t, which can be properly constructed and
+ * operated on, but tell Bison only about its POD base class.
+ */
 struct YDFLTYPE;
 struct cdfval_base_t {
   bool off;
diff --git a/gcc/cobol/gcobol.1 b/gcc/cobol/gcobol.1
index 0ce890e97229..6db54009fcf7 100644
--- a/gcc/cobol/gcobol.1
+++ b/gcc/cobol/gcobol.1
@@ -39,7 +39,7 @@ compiles \*[lang] source code to object code, and optionally produces an
 executable binary or shared object.  As a GCC component, it accepts
 all options that affect code-generation and linking.  Options specific
 to \*[lang] are listed below.
-.Bl -tag -width \0\0debug
+.Bl -tag -width "\0\0debug"
 .It Fl main Ar filename
 .Nm
 will generate a
@@ -197,14 +197,12 @@ Otherwise, columns 1-6 are examined. If those characters are all digits
 or blanks, the file is assumed to be in
 .Em "fixed-form reference format",
 also with the indicator in column 7.
-
 If not auto-detected as 
 .Em "fixed-form reference format" 
 or 
 .Em "extended source format",
 the file is assumed to be in 
 .Em "free-form reference format".
-
 .Pp
 .
 .It Fl fcobol-exceptions Ar exception Op Ns , Ns Ar exception Ns ...
@@ -1088,7 +1086,7 @@ the directive must appear before
 .Pp
 To test a feature-set variable, use
 .Dl >>IF Ar feature Li DEFINED
-..
+.
 .Ss Copybooks
 .Nm
 supports the CDF
@@ -1294,7 +1292,7 @@ stores and converts
 numbers.  Converting the floating-point value to the numeric display
 value 0055110 is done by multiplying 55.10...\& by 1,000 and then
 truncating the result to an integer.  And it turns out that even
-though 55.11 can’t be represented in floating-point as an exact value,
+though 55.11 can't be represented in floating-point as an exact value,
 the product of the multiplication, 55110, is an exact value.
 .Pp
 In cases where it is important for conversions to have predictable
@@ -1325,7 +1323,7 @@ specified for a calculation, then the intermediate result becomes a
 .
 .Ss A warning about binary floating point comparison
 The cardinal rule when doing comparisons involving floating-point
-values: Never, ever, test for equality.  It’s just not worth the hassle.
+values: Never, ever, test for equality.  It's just not worth the hassle.
 .Pp
 For example:
 .Bd -literal
@@ -1361,7 +1359,7 @@ and you really test the code.  And then avoid it anyway.
 .Pp
 Finally, it is observably the case that the
 .Nm
-implementations of floating-point conversions and comparisons don’t
+implementations of floating-point conversions and comparisons don't
 precisely match the behavior of other \*[lang] compilers.
 .Pp
 You have been warned.
diff --git a/gcc/cobol/gcobolspec.cc b/gcc/cobol/gcobolspec.cc
index d1ffc97f8ca5..70784d7e3570 100644
--- a/gcc/cobol/gcobolspec.cc
+++ b/gcc/cobol/

[gcc(refs/vendors/riscv/heads/gcc-15-with-riscv-opts)] [RISC-V] Handle 32bit operands in condition for conditional moves

2025-06-10 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:e47d4027be18683eb090b3601e50c854a6c60c5b

commit e47d4027be18683eb090b3601e50c854a6c60c5b
Author: Shreya Munnangi 
Date:   Sun Jun 8 08:42:53 2025 -0600

[RISC-V] Handle 32bit operands in condition for conditional moves

So here's the next chunk of conditional move work from Shreya.

It's been a long-standing wart that the conditional move expander does not
support sub-word operands in the comparison, particularly since we have
support routines to handle the necessary extensions for that case.

This patch adjusts the expander to use riscv_extend_comparands rather than
fail for that case.  I've built spec2017 before/after this and we definitely
get more conditional moves and they look sensible from a performance
standpoint.  None are likely hitting terribly hot code, so I wouldn't expect
any performance jumps.
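
To see why sub-word comparison operands need extending before a word-sized
compare, here is a small C++ model.  It is a sketch only, not GCC code:
raw_reg, sign_extend32 and word_lt_zero are hypothetical names, and the
actual extension policy is whatever riscv_extend_comparands implements.

```cpp
#include <cstdint>

// Hypothetical model of a 64-bit register whose low 32 bits hold the value
// -1 but whose upper bits have not been extended.
constexpr std::uint64_t raw_reg = 0x00000000FFFFFFFFull;

// Sign-extend the low 32 bits of a register image to a full 64-bit word.
constexpr std::int64_t sign_extend32(std::uint64_t reg)
{
  return static_cast<std::int32_t>(static_cast<std::uint32_t>(reg));
}

// A word-sized signed "less than zero" test; it looks at all 64 bits.
constexpr bool word_lt_zero(std::int64_t word) { return word < 0; }

// Comparing the unextended register gives the wrong answer for -1 ...
static_assert(!word_lt_zero(static_cast<std::int64_t>(raw_reg)));
// ... while the sign-extended operand compares as expected.
static_assert(word_lt_zero(sign_extend32(raw_reg)));
```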

Waiting on pre-commit testing to do its thing.

gcc/
* config/riscv/riscv.cc (riscv_expand_conditional_move): Use
riscv_extend_comparands to extend sub-word comparison arguments.

Co-authored-by: Jeff Law  

(cherry picked from commit 2523c15430d980c380684c3df49f9ae016b8647d)

Diff:
---
 gcc/config/riscv/riscv.cc | 141 ++
 1 file changed, 79 insertions(+), 62 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index dd29059412b1..aa8cd97b3102 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -5389,11 +5389,18 @@ riscv_expand_conditional_branch (rtx label, rtx_code code, rtx op0, rtx op1)
 bool
 riscv_expand_conditional_move (rtx dest, rtx op, rtx cons, rtx alt)
 {
-  machine_mode mode = GET_MODE (dest);
+  machine_mode dst_mode = GET_MODE (dest);
+  machine_mode cond_mode = GET_MODE (dest);
   rtx_code code = GET_CODE (op);
   rtx op0 = XEXP (op, 0);
   rtx op1 = XEXP (op, 1);
 
+  /* General note.  This is called from the conditional move
+ expander.  That simplifies the cases we need to worry about
+ as we know the destination will have the same mode as the
+ true/false arms.  Furthermore we know that mode will be
+ DI/SI for rv64 or SI for rv32.  */
+
   /* For some tests, we can easily construct a 0, -1 value
  which can then be used to synthesize more efficient
  sequences that don't use zicond.  */
@@ -5416,12 +5423,12 @@ riscv_expand_conditional_move (rtx dest, rtx op, rtx cons, rtx alt)
 not a constant, then avoid zicond as more efficient sequences
 using the splatted sign bit are often possible.  */
   if (CONST_INT_P (alt)
- && alt != CONST0_RTX (mode)
+ && alt != CONST0_RTX (dst_mode)
  && !CONST_INT_P (cons))
return false;
 
   if (CONST_INT_P (cons)
- && cons != CONST0_RTX (mode)
+ && cons != CONST0_RTX (dst_mode)
  && !CONST_INT_P (alt))
return false;
 
@@ -5429,8 +5436,9 @@ riscv_expand_conditional_move (rtx dest, rtx op, rtx cons, rtx alt)
 }
 
   if (((TARGET_ZICOND_LIKE
-   || (arith_operand (cons, mode) && arith_operand (alt, mode)))
-   && (GET_MODE_CLASS (mode) == MODE_INT))
+   || (arith_operand (cons, dst_mode) && arith_operand (alt, dst_mode)))
+   && GET_MODE_CLASS (dst_mode) == MODE_INT
+   && GET_MODE_CLASS (cond_mode) == MODE_INT)
   || TARGET_SFB_ALU || TARGET_XTHEADCONDMOV)
 {
   machine_mode mode0 = GET_MODE (op0);
@@ -5449,13 +5457,13 @@ riscv_expand_conditional_move (rtx dest, rtx op, rtx cons, rtx alt)
   if (!REG_P (op1) && !CONST_INT_P (op1))
op1 = force_reg (word_mode, op1);
 
-  /* In the fallback generic case use MODE rather than WORD_MODE for
-the output of the SCC instruction, to match the mode of the NEG
+  /* In the fallback generic case use DST_MODE rather than WORD_MODE
+for the output of the SCC instruction, to match the mode of the NEG
 operation below.  The output of SCC is 0 or 1 boolean, so it is
 valid for input in any scalar integer mode.  */
   rtx tmp = gen_reg_rtx ((TARGET_ZICOND_LIKE
  || TARGET_SFB_ALU || TARGET_XTHEADCONDMOV)
-? word_mode : mode);
+? word_mode : dst_mode);
   bool invert = false;
 
   /* Canonicalize the comparison.  It must be an equality comparison
@@ -5484,7 +5492,7 @@ riscv_expand_conditional_move (rtx dest, rtx op, rtx cons, rtx alt)
  else
return false;
 
- op = gen_rtx_fmt_ee (invert ? EQ : NE, mode, tmp, const0_rtx);
+ op = gen_rtx_fmt_ee (invert ? EQ : NE, cond_mode, tmp, const0_rtx);
 
  /* We've generated a new comparison.  Update the local variables.  */
  code = GET_CODE (op);
@@ -5503,10 +5511,10 @@ riscv_expand_conditional_move (rtx dest, rtx op, rtx cons, rtx alt)
 arm of the conditional move.  That allows us to sup

[gcc(refs/vendors/riscv/heads/gcc-15-with-riscv-opts)] RISC-V: Reconcile the existing test for vdivu.vx combine

2025-06-10 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:f633e6fe9a173e001890629d659e2ecd68be2fea

commit f633e6fe9a173e001890629d659e2ecd68be2fea
Author: Pan Li 
Date:   Fri Jun 6 10:03:50 2025 +0800

RISC-V: Reconcile the existing test for vdivu.vx combine

Some existing vdiv-related tests need their asm checks adjusted
because of the cost model.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv-nofm.c: Adjust
the asm check for vdivu.
* gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vdiv-rv64gcv-nofm.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vdiv-rv64gcv.c: Ditto.

Signed-off-by: Pan Li 
(cherry picked from commit 08a0b6dabd76c8ca4366a59c2fdcd1ef8f8b1cb9)

Diff:
---
 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv-nofm.c | 4 ++--
 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv.c  | 4 ++--
 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-rv64gcv-nofm.c | 4 ++--
 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-rv64gcv.c  | 4 ++--
 4 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv-nofm.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv-nofm.c
index 4685ed22a784..a8be5edcc70c 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv-nofm.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv-nofm.c
@@ -5,8 +5,8 @@
 
 /* { dg-final { scan-assembler-times {\tvdiv\.vv} 8 } } */
 /* { dg-final { scan-assembler-not {\tvdiv\.vx} } } */
-/* { dg-final { scan-assembler-times {\tvdivu\.vv} 5 } } */
-/* { dg-final { scan-assembler-times {\tvdivu\.vx} 3 } } */
+/* { dg-final { scan-assembler-times {\tvdivu\.vv} 8 } } */
+/* { dg-final { scan-assembler-not {\tvdivu\.vx} } } */
 
 /* { dg-final { scan-assembler-times {\tvfdiv\.vv} 6 } } */
 /* { dg-final { scan-assembler-not {\tvfdiv\.vf} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv.c
index 59c48d2d9bae..7feee0ec154a 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv.c
@@ -5,8 +5,8 @@
 
 /* { dg-final { scan-assembler-times {\tvdiv\.vv} 8 } } */
 /* { dg-final { scan-assembler-not {\tvdiv\.vx} } } */
-/* { dg-final { scan-assembler-times {\tvdivu\.vv} 5 } } */
-/* { dg-final { scan-assembler-times {\tvdivu\.vx} 3 } } */
+/* { dg-final { scan-assembler-times {\tvdivu\.vv} 8 } } */
+/* { dg-final { scan-assembler-not {\tvdivu\.vx} } } */
 
 /* Division by constant is done by calculating a reciprocal and
then multiplying.  Hence we do not expect 6 vfdivs.  */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-rv64gcv-nofm.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-rv64gcv-nofm.c
index b574dc42182c..766b17fc37da 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-rv64gcv-nofm.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-rv64gcv-nofm.c
@@ -5,8 +5,8 @@
 
 /* { dg-final { scan-assembler-times {\tvdiv\.vv} 8 } } */
 /* { dg-final { scan-assembler-not {\tvdiv\.vx} } } */
-/* { dg-final { scan-assembler-times {\tvdivu\.vv} 4 } } */
-/* { dg-final { scan-assembler-times {\tvdivu\.vx} 4 } } */
+/* { dg-final { scan-assembler-times {\tvdivu\.vv} 8 } } */
+/* { dg-final { scan-assembler-not {\tvdivu\.vx} } } */
 
 /* { dg-final { scan-assembler-times {\tvfdiv\.vv} 6 } } */
 /* { dg-final { scan-assembler-not {\tvfdiv\.vf} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-rv64gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-rv64gcv.c
index 9b46c6be0efb..c59c66439f89 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-rv64gcv.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-rv64gcv.c
@@ -5,8 +5,8 @@
 
 /* { dg-final { scan-assembler-times {\tvdiv\.vv} 8 } } */
 /* { dg-final { scan-assembler-not {\tvdiv\.vx} } } */
-/* { dg-final { scan-assembler-times {\tvdivu\.vv} 4 } } */
-/* { dg-final { scan-assembler-times {\tvdivu\.vx} 4 } } */
+/* { dg-final { scan-assembler-times {\tvdivu\.vv} 8 } } */
+/* { dg-final { scan-assembler-not {\tvdivu\.vx} } } */
 
 /* Division by constant is done by calculating a reciprocal and
then multiplying.  Hence we do not expect 6 vfdivs.  */


[gcc(refs/vendors/riscv/heads/gcc-15-with-riscv-opts)] RISC-V: Add test for vec_duplicate + vdivu.vv combine case 0 with GR2VR cost 0, 2 and 15

2025-06-10 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:f34bc1ecce268061e8ad8ef7248203a50ecd0036

commit f34bc1ecce268061e8ad8ef7248203a50ecd0036
Author: Pan Li 
Date:   Fri Jun 6 09:49:56 2025 +0800

RISC-V: Add test for vec_duplicate + vdivu.vv combine case 0 with GR2VR cost 0, 2 and 15

Add asm dump check tests for the vec_duplicate + vdivu.vv combine to
vdivu.vx, with GR2VR costs of 0, 2 and 15.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c: Add asm check
for vdivu.vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary_data.h: Add test
data for run test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vdiv-run-1-u16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vdiv-run-1-u32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vdiv-run-1-u64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vdiv-run-1-u8.c: New test.

Signed-off-by: Pan Li 
(cherry picked from commit 2ca7622fd7b32fd538edea8fd8bd8b97ba07ef16)

Diff:
---
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c  |   2 +
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u32.c  |   2 +
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u64.c  |   2 +
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u8.c   |   2 +
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u16.c  |   2 +
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u32.c  |   2 +
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u64.c  |   2 +
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u8.c   |   2 +
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u16.c  |   2 +
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u32.c  |   2 +
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u64.c  |   2 +
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u8.c   |   2 +
 .../riscv/rvv/autovec/vx_vf/vx_binary_data.h   | 196 +
 .../riscv/rvv/autovec/vx_vf/vx_vdiv-run-1-u16.c|  15 ++
 .../riscv/rvv/autovec/vx_vf/vx_vdiv-run-1-u32.c|  15 ++
 .../riscv/rvv/autovec/vx_vf/vx_vdiv-run-1-u64.c|  15 ++
 .../riscv/rvv/autovec/vx_vf/vx_vdiv-run-1-u8.c |  15 ++
 17 files changed, 280 insertions(+)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c
index 7e107d30191d..92fbf227d563 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c
@@ -11,6 +11,7 @@ DEF_VX_BINARY_REVERSE_CASE_0_WRAP(T, -, rsub);
 DEF_VX_BINARY_CASE_0_WRAP(T, &, and)
 DEF_VX_BINARY_CASE_0_WRAP(T, |, or)
 DEF_VX_BINARY_CASE_0_WRAP(T, ^, xor)
+DEF_VX_BINARY_CASE_0_WRAP(T, /, div)
 
 /* { dg-final { scan-assembler-times {vadd.vx} 1 } } */
 /* { dg-final { scan-assembler-times {vsub.vx} 1 } } */
@@ -18,3 +19,4 @@ DEF_VX_BINARY_CASE_0_WRAP(T, ^, xor)
 /* { dg-final { scan-assembler-times {vand.vx} 1 } } */
 /* { dg-final { scan-assembler-times {vor.vx} 1 } } */
 /* { dg-final { scan-assembler-times {vxor.vx} 1 } } */
+/* { dg-final { scan-assembler-times {vdivu.vx} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u32.c
index f8ffab78067a..f487b42820ee 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u32.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u32.c
@@ -11,6 +11,7 @@ DEF_VX_BINARY_REVERSE_CASE_0_WRAP(T, -, rsub);
 DEF_VX_BINARY_CASE_0_WRAP(T, &, and)
 DEF_VX_BINARY_CASE_0_WRAP(T, |, or)
 DEF_VX_BINARY_CASE_0_WRAP(T, ^, xor)
+DEF_VX_BINARY_CASE_0_WRAP(T, /, div)
 
 /* { dg-final { scan-assembler-times {vadd.vx} 1 } } */
 /* { dg-final { scan-assembler-times {vsub.vx} 1 } } */
@@ -18,3 +19,4 @@ DEF_VX_BINARY_CASE_0_WRAP(T, ^, xor)
 /* { dg-final { scan-assembler-times {vand.vx} 1 } } */
 /* { dg-final { scan-assembler-times {vor.vx} 1 } } */
 /* { dg-final { scan-assembler-times {vxor.vx} 1 } } */
+/* { dg-final { scan-assembler-times {vdivu.vx} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u64.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u64.c
index 31d294567e83..761d25c0833a 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u64.c
+++ b/gcc/

[gcc(refs/vendors/riscv/heads/gcc-15-with-riscv-opts)] RISC-V: Support -mcpu for XiangShan Kunminghu cpu.

2025-06-10 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:523071dcf5e5abecc50f96b24ad3d35dadbbe7cd

commit 523071dcf5e5abecc50f96b24ad3d35dadbbe7cd
Author: Jiawei 
Date:   Wed Jun 4 17:56:49 2025 +0800

RISC-V: Support -mcpu for XiangShan Kunminghu cpu.

This patch adds support for the XiangShan Kunminghu CPU in GCC, allowing
the use of the `-mcpu=xiangshan-kunminghu` option.

XiangShan-KunMingHu is the third-generation open-source high-performance
RISC-V processor.[1] You can find the corresponding ISA extension from the
XiangShan Github repository.[2] The latest news of KunMingHu can be found
in the XiangShan Biweekly.[3]

[1] https://github.com/OpenXiangShan/XiangShan-User-Guide/releases.
[2] https://github.com/OpenXiangShan/XiangShan/blob/master/src/main/scala/xiangshan/Parameters.scala
[3] https://docs.xiangshan.cc/zh-cn/latest/blog

A dedicated scheduling model for KunMingHu's hybrid pipeline will be
proposed in a subsequent PR.

gcc/ChangeLog:

* config/riscv/riscv-cores.def (RISCV_TUNE): New cpu tune.
(RISCV_CORE): New cpu.
* doc/invoke.texi: Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/mcpu-xiangshan-kunminghu.c: New test.

Co-Authored-By: Jiawei Chen 
Co-Authored-By: Yangyu Chen 
Co-Authored-By: Tang Haojin 
(cherry picked from commit f0cd40f71ba424bde94dcddbf1df67bb100b82ef)

Diff:
---
 gcc/config/riscv/riscv-cores.def   | 14 
 gcc/doc/invoke.texi|  4 +-
 .../gcc.target/riscv/mcpu-xiangshan-kunminghu.c| 95 ++
 3 files changed, 111 insertions(+), 2 deletions(-)

diff --git a/gcc/config/riscv/riscv-cores.def b/gcc/config/riscv/riscv-cores.def
index 118fef23cad4..cff7c77a0bd7 100644
--- a/gcc/config/riscv/riscv-cores.def
+++ b/gcc/config/riscv/riscv-cores.def
@@ -48,6 +48,7 @@ RISCV_TUNE("xt-c910v2", generic, generic_ooo_tune_info)
 RISCV_TUNE("xt-c920", generic, generic_ooo_tune_info)
 RISCV_TUNE("xt-c920v2", generic, generic_ooo_tune_info)
 RISCV_TUNE("xiangshan-nanhu", xiangshan, xiangshan_nanhu_tune_info)
+RISCV_TUNE("xiangshan-kunminghu", xiangshan, generic_ooo_tune_info)
 RISCV_TUNE("generic-ooo", generic_ooo, generic_ooo_tune_info)
 RISCV_TUNE("size", generic, optimize_size_tune_info)
 RISCV_TUNE("mips-p8700", mips_p8700, mips_p8700_tune_info)
@@ -154,6 +155,19 @@ RISCV_CORE("xiangshan-nanhu",  "rv64imafdc_zba_zbb_zbc_zbs_"
  "svinval_zicbom_zicboz",
  "xiangshan-nanhu")
 
+RISCV_CORE("xiangshan-kunminghu",   "rv64imafdcbvh_sdtrig_sha_shcounterenw_"
+ "shgatpa_shlcofideleg_shtvala_shvsatpa_shvstvala_shvstvecd_"
+ "smaia_smcsrind_smdbltrp_smmpm_smnpm_smrnmi_smstateen_"
+ "ssaia_ssccptr_sscofpmf_sscounterenw_sscsrind_ssdbltrp_"
+ "ssnpm_sspm_ssstateen_ssstrict_sstc_sstvala_sstvecd_"
+ "ssu64xl_supm_svade_svbare_svinval_svnapot_svpbmt_za64rs_"
+ "zacas_zawrs_zba_zbb_zbc_zbkb_zbkc_zbkx_zbs_zcb_zcmop_"
+ "zfa_zfh_zfhmin_zic64b_zicbom_zicbop_zicboz_ziccif_"
+ "zicclsm_ziccrse_zicntr_zicond_zicsr_zifencei_zihintpause_"
+ "zihpm_zimop_zkn_zknd_zkne_zknh_zksed_zksh_zkt_zvbb_zvfh_"
+ "zvfhmin_zvkt_zvl128b_zvl32b_zvl64b",
+ "xiangshan-kunminghu")
+
 RISCV_CORE("mips-p8700",   "rv64imafd_zicsr_zmmul_"
  "zaamo_zalrsc_zba_zbb",
  "mips-p8700")
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index cfcb4b3cf978..ab49ba00c0a7 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -31123,8 +31123,8 @@ Permissible values for this option are: @samp{mips-p8700}, @samp{sifive-e20},
 @samp{sifive-e76}, @samp{sifive-s21}, @samp{sifive-s51}, @samp{sifive-s54},
 @samp{sifive-s76}, @samp{sifive-u54}, @samp{sifive-u74}, @samp{sifive-x280},
 @samp{sifive-xp450}, @samp{sifive-x670}, @samp{thead-c906}, @samp{tt-ascalon-d8},
-@samp{xiangshan-nanhu}, @samp{xt-c908}, @samp{xt-c908v}, @samp{xt-c910}, @samp{xt-c910v2},
-@samp{xt-c920}, @samp{xt-c920v2}.
+@samp{xiangshan-nanhu}, @samp{xiangshan-kunminghu}, @samp{xt-c908}, @samp{xt-c908v},
+@samp{xt-c910}, @samp{xt-c910v2}, @samp{xt-c920}, @samp{xt-c920v2}.
 
 Note that @option{-mcpu} does not override @option{-march} or @option{-mtune}.
 
diff --git a/gcc/testsuite/gcc.target/riscv/mcpu-xiangshan-kunminghu.c b/gcc/testsuite/gcc.target/riscv/mcpu-xiangshan-kunminghu.c
new file mode 100644
index ..e3ae65c46444
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/mcpu-xiangshan-kunminghu.c
@@ -0,0 +1,95 @@
+/* { dg-do compile { target { rv64 } } } */
+/* { dg-skip-if "-march given" { *-*-* } { "-march=*" } } */
+/* { dg-

[gcc(refs/vendors/riscv/heads/gcc-15-with-riscv-opts)] RISC-V: Combine vec_duplicate + vdivu.vv to vdivu.vx on GR2VR cost

2025-06-10 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:7e0e5701d71b649ed911a5c236af8f8fd714bbc4

commit 7e0e5701d71b649ed911a5c236af8f8fd714bbc4
Author: Pan Li 
Date:   Fri Jun 6 09:33:21 2025 +0800

RISC-V: Combine vec_duplicate + vdivu.vv to vdivu.vx on GR2VR cost

This patch combines vec_duplicate + vdivu.vv into vdivu.vx, as in the
example code below.  The related pattern depends on the cost of
vec_duplicate from GR2VR: late-combine performs the combination if the
GR2VR cost is zero, and rejects it if the GR2VR cost is greater than zero.

Assume we have example code like the following, with a GR2VR cost of 0.

  #define DEF_VX_BINARY(T, OP)\
  void\
  test_vx_binary (T * restrict out, T * restrict in, T x, unsigned n) \
  {   \
for (unsigned i = 0; i < n; i++)  \
  out[i] = in[i] OP x;\
  }

  DEF_VX_BINARY(int32_t, /)

Before this patch:
  test_vx_binary_or_int32_t_case_0:
    beq a3,zero,.L8
    vsetvli a5,zero,e32,m1,ta,ma
    vmv.v.x v2,a2
    slli a3,a3,32
    srli a3,a3,32
  .L3:
    vsetvli a5,a3,e32,m1,ta,ma
    vle32.v v1,0(a1)
    slli a4,a5,2
    sub a3,a3,a5
    add a1,a1,a4
    vdivu.vv v1,v1,v2
    vse32.v v1,0(a0)
    add a0,a0,a4
    bne a3,zero,.L3

After this patch:
  test_vx_binary_or_int32_t_case_0:
    beq a3,zero,.L8
    slli a3,a3,32
    srli a3,a3,32
  .L3:
    vsetvli a5,a3,e32,m1,ta,ma
    vle32.v v1,0(a1)
    slli a4,a5,2
    sub a3,a3,a5
    add a1,a1,a4
    vdivu.vx v1,v1,a2
    vse32.v v1,0(a0)
    add a0,a0,a4
    bne a3,zero,.L3

The following test suite passed for this patch:
* The full rv64gcv regression test.
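
The combination is valid because broadcasting a scalar and then dividing
lane by lane gives the same result as using the scalar directly as the
second operand.  The following is only a scalar C model of that
equivalence, not code from the patch; LANES and both function names are
assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 4 /* hypothetical vector length, for illustration only */

/* Models vmv.v.x + vdivu.vv: broadcast x into a temporary vector,
   then divide lane by lane.  */
static void
udiv_vv_with_dup (uint32_t *out, const uint32_t *in, uint32_t x, size_t n)
{
  uint32_t dup[LANES];
  assert (x != 0); /* division by zero is undefined in C */
  for (size_t l = 0; l < LANES; l++)
    dup[l] = x;
  for (size_t i = 0; i < n; i += LANES)
    for (size_t l = 0; l < LANES && i + l < n; l++)
      out[i + l] = in[i + l] / dup[l];
}

/* Models vdivu.vx: the scalar is used directly as the second operand,
   so no broadcast is needed.  */
static void
udiv_vx (uint32_t *out, const uint32_t *in, uint32_t x, size_t n)
{
  assert (x != 0);
  for (size_t i = 0; i < n; i++)
    out[i] = in[i] / x;
}
```

Both functions produce identical output for any nonzero x; the second
form simply avoids materializing the broadcast vector.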

gcc/ChangeLog:

* config/riscv/riscv-v.cc (expand_vx_binary_vec_vec_dup): Add new
case UDIV.
* config/riscv/riscv.cc (riscv_rtx_costs): Ditto.
* config/riscv/vector-iterators.md: Add new op divu.

Signed-off-by: Pan Li 
(cherry picked from commit be205ec675ed79275e694dda90f0f97fc6ac0e7a)

Diff:
---
 gcc/config/riscv/riscv-v.cc  | 1 +
 gcc/config/riscv/riscv.cc| 1 +
 gcc/config/riscv/vector-iterators.md | 2 +-
 3 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index a41317f322f7..6a7eb7161b37 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -5568,6 +5568,7 @@ expand_vx_binary_vec_vec_dup (rtx op_0, rtx op_1, rtx op_2,
 case XOR:
 case MULT:
 case DIV:
+case UDIV:
   icode = code_for_pred_scalar (code, mode);
   break;
 default:
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 413eae05f4c9..99eeba64b6f9 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -3943,6 +3943,7 @@ riscv_rtx_costs (rtx x, machine_mode mode, int outer_code, int opno ATTRIBUTE_UN
  switch (GET_CODE (op))
{
case DIV:
+   case UDIV:
  *total = get_vector_binary_rtx_cost (op, scalar2vr_cost);
  break;
default:
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index 86f31f3afabb..36301b0be6e7 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -4042,7 +4042,7 @@
 ])
 
 (define_code_iterator any_int_binop_no_shift_v_vdup [
-  plus minus and ior xor mult div
+  plus minus and ior xor mult div udiv
 ])
 
 (define_code_iterator any_int_binop_no_shift_vdup_v [


[gcc(refs/vendors/riscv/heads/gcc-15-with-riscv-opts)] RISC-V: Regen riscv-ext.texi [NFC]

2025-06-10 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:e3ada8eb1b917128350fc45c00092b1687506e41

commit e3ada8eb1b917128350fc45c00092b1687506e41
Author: Kito Cheng 
Date:   Tue Jun 10 10:32:37 2025 +0800

RISC-V: Regen riscv-ext.texi [NFC]

Regenerates the `riscv-ext.texi` file in the GCC documentation.

gcc/ChangeLog:

* doc/riscv-ext.texi: Regen.

(cherry picked from commit fce42e15063e76d2535586a25d560e2340ca8ac9)

Diff:
---
 gcc/doc/riscv-ext.texi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/doc/riscv-ext.texi b/gcc/doc/riscv-ext.texi
index e69a2df768d4..c3ed1bfb5936 100644
--- a/gcc/doc/riscv-ext.texi
+++ b/gcc/doc/riscv-ext.texi
@@ -520,7 +520,7 @@
 
 @item smrnmi
 @tab 1.0
-@tab Resumable Non-Maskable Interrupts
+@tab Resumable non-maskable interrupts
 
 @item smstateen
 @tab 1.0


[gcc(refs/vendors/riscv/heads/gcc-15-with-riscv-opts)] RISC-V: frm/mode-switch: robustify call_insn backtracking [PR120203]

2025-06-10 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:3458870cad987cf3e3bf5add23399f1ca285

commit 3458870cad987cf3e3bf5add23399f1ca285
Author: Vineet Gupta 
Date:   Sun Jun 8 14:55:11 2025 -0700

RISC-V: frm/mode-switch: robustify call_insn backtracking [PR120203]

As described in prior patches of this series, the FRM mode switching state
machine has special handling around calls.  After a call_insn, if in the
DYN_CALL state, it needs to transition back to DYN, which requires checking
back whether the previous insn was indeed a call.

Deferring or delaying this could lead to unnecessary final transitions,
leading to extraneous FRM save/restores.

However, the current back-checking of call_insn was too coarse-grained.
It used prev_nonnote_nondebug_insn_bb (), which implies that the current
insn is in the same BB as the call_insn, which need not always be true.
The problem is not with the API, but with its use.

Fix this by tracking call_insn more explicitly in TARGET_MODE_NEEDED.
 - On seeing a call_insn, record a "call note".
 - On subsequent insns if a "call note" is seen, do the needed state switch
   and clear the note.
 - Remove the old BB based search.
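
The tracking scheme above can be sketched as a small state machine.
This is a minimal illustration only: every name below (insn_kind,
frm_mode, call_tracker, frm_mode_needed) is hypothetical and does not
correspond to the actual code in riscv.cc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not the riscv.cc code.  */
enum insn_kind { INSN_CALL, INSN_OTHER };
enum frm_mode { FRM_NONE, FRM_DYN, FRM_DYN_CALL };

struct call_tracker
{
  bool call_note; /* set when the previously visited insn was a call */
};

/* Return the mode needed for the insn being visited.  The note lives
   in per-function state, so it survives basic block boundaries.  */
static enum frm_mode
frm_mode_needed (struct call_tracker *t, enum insn_kind kind)
{
  assert (t != NULL);
  if (kind == INSN_CALL)
    {
      t->call_note = true;  /* record the "call note" */
      return FRM_DYN_CALL;
    }
  if (t->call_note)
    {
      t->call_note = false; /* consume the note */
      return FRM_DYN;       /* switch back from DYN_CALL to DYN */
    }
  return FRM_NONE;
}
```

Because the note is kept alongside the function rather than looked up
from the previous insn in the same BB, the insn following a call is
recognized even when it starts a new basic block.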

The number of FRM read/writes across SPEC2017 -Ofast -mrv64gcv improves.

                    Before              After
              -----------------   -----------------
               frrm fsrmi  fsrm    frrm fsrmi  fsrm
perlbench_r      17     0     1      17     0     1
cpugcc_r         11     0     0      11     0     0
bwaves_r         16     0     1      16     0     1
mcf_r            11     0     0      11     0     0
cactusBSSN_r     19     0     1      19     0     1
namd_r           14     0     1      14     0     1
parest_r         24     0     1      24     0     1
povray_r         26     1     6      26     1     6
lbm_r             6     0     0       6     0     0
omnetpp_r        17     0     1      17     0     1
wrf_r          1268    13  1603     613    13    82
cpuxalan_r       17     0     1      17     0     1
ldecod_r         11     0     0      11     0     0
x264_r           11     0     0      11     0     0
blender_r        61    12    42      39    12    16
cam4_r           45    13    20      40    13    17
deepsjeng_r      11     0     0      11     0     0
imagick_r       132    16    25      33    16    18
leela_r          12     0     0      12     0     0
nab_r            13     0     1      13     0     1
exchange2_r      16     0     1      16     0     1
fotonik3d_r      19     0     1      19     0     1
roms_r           21     0     1      21     0     1
xz_r              6     0     0       6     0     0
              -----------------   -----------------
Column totals  1804    55  1707    1023    55   150
              -----------------   -----------------
Grand total                3566                1228
              -----------------   -----------------

While this was a missed-optimization exercise, testing exposed a latent
bug as an additional testsuite failure, captured as PR120203.  The existing
test float-point-dynamic-frm-74.c was missing an FRM save after a call,
which this fixes (as a side effect of robust call state tracking).

|   frrm     a5
|   fsrmi    1
|
|   vfadd.vv v1,v8,v9
|   fsrm     a5
|   beq      a1,zero,.L2
|
|   call     normalize_vl_1
|   frrm     a5
|
| .L3:
|   fsrmi    3
|   vfadd.vv v8,v8,v9
|   fsrm     a5
|   jr       ra
|
| .L2:
|   call     normalize_vl_2
|   frrm     a5   <-- missing
|   j        .L3

PR target/120203

gcc/ChangeLog:

* config/riscv/riscv.cc (CFUN_IN_CALL): New macro.
(struct mode_switching_info): Add new field.
(riscv_frm_adjust_mode_after_call): Remove.
(riscv_frm_mode_needed): Track call_insn.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-dynamic-frm-74.c: Expect
an additional FRRM.

Signed-off-by: Vineet Gupta 
(cherry picked from commit fd042192094c456e275c53dfe92383bec1e9fca3)

Diff:
---
 gcc/config/riscv/riscv.cc  | 42 +-
 .../riscv/rvv/base/float-point-dynamic-frm-74.c|  2 +-
 2 files changed, 17 insertions(+), 27 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 1e56ee5dcb63..3fd18c1646dc 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -107,6 +107,8 @@ along with GCC; see the file COPYING3.  If not see
 /* True the mode switching has static frm, or false.  */
 #define STATIC_FRM_P(c) ((c)->machine->mode_sw_info.static_frm_p)

[gcc(refs/vendors/riscv/heads/gcc-15-with-riscv-opts)] RISC-V: Reconcile the existing test for vrem.vx combine

2025-06-10 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:b48ccf650b32ff2af612029da693b83c23c490bc

commit b48ccf650b32ff2af612029da693b83c23c490bc
Author: Pan Li 
Date:   Sun Jun 8 16:50:52 2025 +0800

RISC-V: Reconcile the existing test for vrem.vx combine

Some existing vrem-related tests need their asm checks adjusted
due to the cost model.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/vrem-rv32gcv.c: Adjust the
asm check for vrem.
* gcc.target/riscv/rvv/autovec/binop/vrem-rv64gcv.c: Ditto.

Signed-off-by: Pan Li 
(cherry picked from commit 4df4acf002cc3672478edb43f374cef3ffbd1f54)

Diff:
---
 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrem-rv32gcv.c | 4 ++--
 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrem-rv64gcv.c | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrem-rv32gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrem-rv32gcv.c
index a87a6c70df1f..ad918a9b800a 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrem-rv32gcv.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrem-rv32gcv.c
@@ -2,8 +2,8 @@
 
 #include "vrem-template.h"
 
-/* { dg-final { scan-assembler-times {\tvrem\.vv} 5 } } */
-/* { dg-final { scan-assembler-times {\tvrem\.vx} 3 } } */
+/* { dg-final { scan-assembler-times {\tvrem\.vv} 8 } } */
+/* { dg-final { scan-assembler-not {\tvrem\.vx} } } */
 /* { dg-final { scan-assembler-times {\tvremu\.vv} 5 } } */
 /* { dg-final { scan-assembler-times {\tvremu\.vx} 3 } } */
 /* { dg-final { scan-tree-dump-times "\.COND_LEN_MOD" 16 "optimized" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrem-rv64gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrem-rv64gcv.c
index 938169574aac..4e28f99e2886 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrem-rv64gcv.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrem-rv64gcv.c
@@ -3,8 +3,8 @@
 
 #include "vrem-template.h"
 
-/* { dg-final { scan-assembler-times {\tvrem\.vv} 4 } } */
-/* { dg-final { scan-assembler-times {\tvrem\.vx} 4 } } */
+/* { dg-final { scan-assembler-times {\tvrem\.vv} 8 } } */
+/* { dg-final { scan-assembler-not {\tvrem\.vx} } } */
 /* { dg-final { scan-assembler-times {\tvremu\.vv} 4 } } */
 /* { dg-final { scan-assembler-times {\tvremu\.vx} 4 } } */
 /* { dg-final { scan-tree-dump-times "\.COND_LEN_MOD" 16 "optimized" } } */


[gcc(refs/vendors/riscv/heads/gcc-15-with-riscv-opts)] RISC-V: Add test for vec_duplicate + vrem.vv combine case 0 with GR2VR cost 0, 2 and 15

2025-06-10 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:a4bcae5ad4962e1c6f6337ae93e75aeddecc922e

commit a4bcae5ad4962e1c6f6337ae93e75aeddecc922e
Author: Pan Li 
Date:   Sun Jun 8 16:53:05 2025 +0800

RISC-V: Add test for vec_duplicate + vrem.vv combine case 0 with GR2VR cost 0, 2 and 15

Add asm dump check tests for the vec_duplicate + vrem.vv combine to vrem.vx,
with GR2VR costs of 0, 2 and 15.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add asm check
for vrem.vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary_data.h: Add test
data for run test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vrem-run-1-i16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vrem-run-1-i32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vrem-run-1-i64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vrem-run-1-i8.c: New test.

Signed-off-by: Pan Li 
(cherry picked from commit daee1935f4e366c09fc085905cb49bbf264c5663)

Diff:
---
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c  |   2 +
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i32.c  |   2 +
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i64.c  |   2 +
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i8.c   |   2 +
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i16.c  |   2 +
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i32.c  |   2 +
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i64.c  |   2 +
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i8.c   |   2 +
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i16.c  |   2 +
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i32.c  |   2 +
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i64.c  |   2 +
 .../gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i8.c   |   2 +
 .../riscv/rvv/autovec/vx_vf/vx_binary_data.h   | 196 +
 .../riscv/rvv/autovec/vx_vf/vx_vrem-run-1-i16.c|  15 ++
 .../riscv/rvv/autovec/vx_vf/vx_vrem-run-1-i32.c|  15 ++
 .../riscv/rvv/autovec/vx_vf/vx_vrem-run-1-i64.c|  15 ++
 .../riscv/rvv/autovec/vx_vf/vx_vrem-run-1-i8.c |  15 ++
 17 files changed, 280 insertions(+)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c
index d88e76b5d99c..893d910538ca 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c
@@ -13,6 +13,7 @@ DEF_VX_BINARY_CASE_0_WRAP(T, |, or)
 DEF_VX_BINARY_CASE_0_WRAP(T, ^, xor)
 DEF_VX_BINARY_CASE_0_WRAP(T, *, mul)
 DEF_VX_BINARY_CASE_0_WRAP(T, /, div)
+DEF_VX_BINARY_CASE_0_WRAP(T, %, rem)
 
 /* { dg-final { scan-assembler-times {vadd.vx} 1 } } */
 /* { dg-final { scan-assembler-times {vsub.vx} 1 } } */
@@ -22,3 +23,4 @@ DEF_VX_BINARY_CASE_0_WRAP(T, /, div)
 /* { dg-final { scan-assembler-times {vxor.vx} 1 } } */
 /* { dg-final { scan-assembler-times {vmul.vx} 1 } } */
 /* { dg-final { scan-assembler-times {vdiv.vx} 1 } } */
+/* { dg-final { scan-assembler-times {vrem.vx} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i32.c
index 53189c21d041..26170de40d0c 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i32.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i32.c
@@ -13,6 +13,7 @@ DEF_VX_BINARY_CASE_0_WRAP(T, |, or)
 DEF_VX_BINARY_CASE_0_WRAP(T, ^, xor)
 DEF_VX_BINARY_CASE_0_WRAP(T, *, mul)
 DEF_VX_BINARY_CASE_0_WRAP(T, /, div)
+DEF_VX_BINARY_CASE_0_WRAP(T, %, rem)
 
 /* { dg-final { scan-assembler-times {vadd.vx} 1 } } */
 /* { dg-final { scan-assembler-times {vsub.vx} 1 } } */
@@ -22,3 +23,4 @@ DEF_VX_BINARY_CASE_0_WRAP(T, /, div)
 /* { dg-final { scan-assembler-times {vxor.vx} 1 } } */
 /* { dg-final { scan-assembler-times {vmul.vx} 1 } } */
 /* { dg-final { scan-assembler-times {vdiv.vx} 1 } } */
+/* { dg-final { scan-assembler-times {vrem.vx} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i64.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i64.c
index 5059beb4c6de..04d1fcb5f81f 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i64.c
+++ b/gcc/testsuite/gcc.target/ris

[gcc(refs/vendors/riscv/heads/gcc-15-with-riscv-opts)] RISC-V: Add test for vec_duplicate + vremu.vv combine case 1 with GR2VR cost 0, 1 and 2

2025-06-10 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:64b8deea76e40d16bbcb66ed7204f72cfe4d9669

commit 64b8deea76e40d16bbcb66ed7204f72cfe4d9669
Author: Pan Li 
Date:   Mon Jun 9 16:35:47 2025 +0800

RISC-V: Add test for vec_duplicate + vremu.vv combine case 1 with GR2VR cost 0, 1 and 2

Add asm dump check tests for the vec_duplicate + vremu.vv combine to vremu.vx,
with GR2VR costs of 0, 1 and 2.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c: Add asm check
for vremu.vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u8.c: Ditto.

Signed-off-by: Pan Li 
(cherry picked from commit bcabb6b0c707271b86a59be755f295ab7c125df1)

Diff:
---
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u32.c | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u64.c | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u8.c  | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u16.c | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u32.c | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u64.c | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u8.c  | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u16.c | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u32.c | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u64.c | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u8.c  | 2 ++
 12 files changed, 24 insertions(+)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c
index 58e4a1e96d6c..16ccaea251b8 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c
@@ -12,6 +12,7 @@ DEF_VX_BINARY_CASE_1_WRAP(T, &, and, VX_BINARY_BODY_X16)
 DEF_VX_BINARY_CASE_1_WRAP(T, |, or, VX_BINARY_BODY_X16)
 DEF_VX_BINARY_CASE_1_WRAP(T, ^, xor, VX_BINARY_BODY_X16)
 DEF_VX_BINARY_CASE_1_WRAP(T, /, div, VX_BINARY_BODY_X16)
+DEF_VX_BINARY_CASE_1_WRAP(T, %, rem, VX_BINARY_BODY_X16)
 
 /* { dg-final { scan-assembler {vadd.vx} } } */
 /* { dg-final { scan-assembler {vsub.vx} } } */
@@ -20,3 +21,4 @@ DEF_VX_BINARY_CASE_1_WRAP(T, /, div, VX_BINARY_BODY_X16)
 /* { dg-final { scan-assembler {vor.vx} } } */
 /* { dg-final { scan-assembler {vxor.vx} } } */
 /* { dg-final { scan-assembler {vdivu.vx} } } */
+/* { dg-final { scan-assembler {vremu.vx} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u32.c
index 3d5f53568dbe..0e2ab8d7838d 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u32.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u32.c
@@ -12,6 +12,7 @@ DEF_VX_BINARY_CASE_1_WRAP(T, &, and, VX_BINARY_BODY_X4)
 DEF_VX_BINARY_CASE_1_WRAP(T, |, or, VX_BINARY_BODY_X4)
 DEF_VX_BINARY_CASE_1_WRAP(T, ^, xor, VX_BINARY_BODY_X4)
 DEF_VX_BINARY_CASE_1_WRAP(T, /, div, VX_BINARY_BODY_X4)
+DEF_VX_BINARY_CASE_1_WRAP(T, %, rem, VX_BINARY_BODY_X4)
 
 /* { dg-final { scan-assembler {vadd.vx} } } */
 /* { dg-final { scan-assembler {vsub.vx} } } */
@@ -20,3 +21,4 @@ DEF_VX_BINARY_CASE_1_WRAP(T, /, div, VX_BINARY_BODY_X4)
 /* { dg-final { scan-assembler {vor.vx} } } */
 /* { dg-final { scan-assembler {vxor.vx} } } */
 /* { dg-final { scan-assembler {vdivu.vx} } } */
+/* { dg-final { scan-assembler {vremu.vx} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u64.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u64.c
index 0edb9257a7a7..80eb8a4752e3 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u64.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u64.c
@@ -12,6 +12,7 @@ DEF_VX_BINARY_CASE_1_WRAP(T, &, and, VX_BINARY_BODY)
 DEF_VX_BINARY_CASE_1_WRAP(T, |, or, VX_BINARY_BODY)
 DEF_VX_BINARY_CASE_1_WRAP(T, ^, xor, VX_BINARY_BODY)
 DEF_VX_BINARY_CASE_1_WRAP(T, /, div, VX_BINARY_BODY)
+DEF_VX_BINARY_CASE_1_WRAP(T, %, rem, VX_BINARY_BODY)
 
 /* { dg-final { scan-assembler {vadd.vx} } } */
 /* { dg-final { scan-assembler {vsub.vx} } } */
@@ -20,3 +21,4 @@ DEF_VX_BINARY_CASE_1_WRAP(T, /, div, VX_BINARY_BODY)
 /* { dg-final { scan-as

[gcc(refs/vendors/riscv/heads/gcc-15-with-riscv-opts)] RISC-V: Add test for vec_duplicate + vrem.vv combine case 1 with GR2VR cost 0, 1 and 2

2025-06-10 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:725444b6f84ccd8bf5c6864dbf29582ccead900f

commit 725444b6f84ccd8bf5c6864dbf29582ccead900f
Author: Pan Li 
Date:   Sun Jun 8 16:55:34 2025 +0800

RISC-V: Add test for vec_duplicate + vrem.vv combine case 1 with GR2VR cost 0, 1 and 2

Add asm dump check tests for the vec_duplicate + vrem.vv combine to vrem.vx,
with GR2VR costs of 0, 1 and 2.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add asm check
for vrem.vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i8.c: Ditto.

Signed-off-by: Pan Li 
(cherry picked from commit 8d745f6d70172132a594dcc650a6d489e7246eda)

Diff:
---
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i32.c | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i64.c | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i8.c  | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i16.c | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i32.c | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i64.c | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i8.c  | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i16.c | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i32.c | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i64.c | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i8.c  | 2 ++
 12 files changed, 24 insertions(+)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c
index 1e409dea08b7..b35e4b712f08 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c
@@ -13,6 +13,7 @@ DEF_VX_BINARY_CASE_1_WRAP(T, |, or, VX_BINARY_BODY_X16)
 DEF_VX_BINARY_CASE_1_WRAP(T, ^, xor, VX_BINARY_BODY_X16)
 DEF_VX_BINARY_CASE_1_WRAP(T, *, mul, VX_BINARY_BODY_X16)
 DEF_VX_BINARY_CASE_1_WRAP(T, /, div, VX_BINARY_BODY_X16)
+DEF_VX_BINARY_CASE_1_WRAP(T, %, rem, VX_BINARY_BODY_X16)
 
 /* { dg-final { scan-assembler {vadd.vx} } } */
 /* { dg-final { scan-assembler {vsub.vx} } } */
@@ -22,3 +23,4 @@ DEF_VX_BINARY_CASE_1_WRAP(T, /, div, VX_BINARY_BODY_X16)
 /* { dg-final { scan-assembler {vxor.vx} } } */
 /* { dg-final { scan-assembler {vmul.vx} } } */
 /* { dg-final { scan-assembler {vdiv.vx} } } */
+/* { dg-final { scan-assembler {vrem.vx} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i32.c
index 2f242c73717e..fb01a6ab92d9 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i32.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i32.c
@@ -13,6 +13,7 @@ DEF_VX_BINARY_CASE_1_WRAP(T, |, or, VX_BINARY_BODY_X4)
 DEF_VX_BINARY_CASE_1_WRAP(T, ^, xor, VX_BINARY_BODY_X4)
 DEF_VX_BINARY_CASE_1_WRAP(T, *, mul, VX_BINARY_BODY_X4)
 DEF_VX_BINARY_CASE_1_WRAP(T, /, div, VX_BINARY_BODY_X4)
+DEF_VX_BINARY_CASE_1_WRAP(T, %, rem, VX_BINARY_BODY_X4)
 
 /* { dg-final { scan-assembler {vadd.vx} } } */
 /* { dg-final { scan-assembler {vsub.vx} } } */
@@ -22,3 +23,4 @@ DEF_VX_BINARY_CASE_1_WRAP(T, /, div, VX_BINARY_BODY_X4)
 /* { dg-final { scan-assembler {vxor.vx} } } */
 /* { dg-final { scan-assembler {vmul.vx} } } */
 /* { dg-final { scan-assembler {vdiv.vx} } } */
+/* { dg-final { scan-assembler {vrem.vx} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i64.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i64.c
index f027bd8129e4..d9341d6b4d24 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i64.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i64.c
@@ -13,6 +13,7 @@ DEF_VX_BINARY_CASE_1_WRAP(T, |, or, VX_BINARY_BODY)
 DEF_VX_BINARY_CASE_1_WRAP(T, ^, xor, VX_BINARY_BODY)
 DEF_VX_BINARY_CASE_1_WRAP(T, *, mul, VX_BINARY_BODY)
 DEF_VX_BINARY_CASE_1_WRAP(T, /, div, VX_BINARY_BODY)
+DEF_VX_BINARY_CASE_1_WRAP(T, %, rem, VX_BINARY_BODY)
 
 /* { dg-final { scan-assembler {vadd.vx} } } */
 /* { dg-final { scan-assembler {vsub.vx} } } */
@@ -22,3 +23,4 @@ DEF_VX_BINARY_CASE_1_WRAP(T, /, div, VX_BINARY_BODY)
 /* { dg-final { scan-assemble

[gcc(refs/vendors/riscv/heads/gcc-15-with-riscv-opts)] RISC-V: Combine vec_duplicate + vrem.vv to vrem.vx on GR2VR cost

2025-06-10 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:852bbabde714134af1940772450e059d26bc5b65

commit 852bbabde714134af1940772450e059d26bc5b65
Author: Pan Li 
Date:   Sun Jun 8 16:48:33 2025 +0800

RISC-V: Combine vec_duplicate + vrem.vv to vrem.vx on GR2VR cost

This patch combines vec_duplicate + vrem.vv into vrem.vx, as in the
example below.  The related pattern depends on the cost of the
vec_duplicate from GR2VR.  The late-combine pass performs the
combination if the GR2VR cost is zero and rejects it if the GR2VR cost
is greater than zero.

Assume the example code below, with a GR2VR cost of 0.

  #define DEF_VX_BINARY(T, OP)                                        \
  void                                                                \
  test_vx_binary (T * restrict out, T * restrict in, T x, unsigned n) \
  {                                                                   \
    for (unsigned i = 0; i < n; i++)                                  \
      out[i] = in[i] OP x;                                            \
  }

  DEF_VX_BINARY(int32_t, %)

Before this patch:
  test_vx_binary_or_int32_t_case_0:
    beq a3,zero,.L8
    vsetvli a5,zero,e32,m1,ta,ma
    vmv.v.x v2,a2
    slli a3,a3,32
    srli a3,a3,32
  .L3:
    vsetvli a5,a3,e32,m1,ta,ma
    vle32.v v1,0(a1)
    slli a4,a5,2
    sub a3,a3,a5
    add a1,a1,a4
    vrem.vv v1,v1,v2
    vse32.v v1,0(a0)
    add a0,a0,a4
    bne a3,zero,.L3

After this patch:
  test_vx_binary_or_int32_t_case_0:
    beq a3,zero,.L8
    slli a3,a3,32
    srli a3,a3,32
  .L3:
    vsetvli a5,a3,e32,m1,ta,ma
    vle32.v v1,0(a1)
    slli a4,a5,2
    sub a3,a3,a5
    add a1,a1,a4
    vrem.vx v1,v1,a2
    vse32.v v1,0(a0)
    add a0,a0,a4
    bne a3,zero,.L3
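
The accept/reject behavior described in this message can be pictured
with a toy cost comparison.  This is a deliberately simplified model
under stated assumptions, not GCC's cost code: accept_vx_combine and
its parameters are hypothetical, and the real decision is made by
late-combine using the costs reported by riscv_rtx_costs.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model with hypothetical names; not GCC's actual cost code.
   The fused .vx form is charged the vector op plus the GR2VR cost,
   and the rewrite is accepted only if that is no more expensive
   than the original .vv op.  */
static bool
accept_vx_combine (int vv_op_cost, int gr2vr_cost)
{
  assert (vv_op_cost >= 0 && gr2vr_cost >= 0);
  int vx_op_cost = vv_op_cost + gr2vr_cost;
  return vx_op_cost <= vv_op_cost;
}
```

Under this model a GR2VR cost of 0 accepts the combination and any
positive GR2VR cost rejects it, matching the behavior stated above.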

gcc/ChangeLog:

* config/riscv/riscv-v.cc (expand_vx_binary_vec_vec_dup): Add new
case MOD.
* config/riscv/riscv.cc (riscv_rtx_costs): Ditto.
* config/riscv/vector-iterators.md: Add new op mod.

Signed-off-by: Pan Li 
(cherry picked from commit b96e319dbd19328a2243b2950155be57532c213b)

Diff:
---
 gcc/config/riscv/riscv-v.cc  | 1 +
 gcc/config/riscv/riscv.cc| 1 +
 gcc/config/riscv/vector-iterators.md | 2 +-
 3 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 6a7eb7161b37..c31ec9e9b419 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -5569,6 +5569,7 @@ expand_vx_binary_vec_vec_dup (rtx op_0, rtx op_1, rtx op_2,
 case MULT:
 case DIV:
 case UDIV:
+case MOD:
   icode = code_for_pred_scalar (code, mode);
   break;
 default:
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 3fd18c1646dc..f98072cca7ce 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -3949,6 +3949,7 @@ riscv_rtx_costs (rtx x, machine_mode mode, int outer_code, int opno ATTRIBUTE_UN
{
case DIV:
case UDIV:
+   case MOD:
  *total = get_vector_binary_rtx_cost (op, scalar2vr_cost);
  break;
default:
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index 36301b0be6e7..b1fd607320ef 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -4042,7 +4042,7 @@
 ])
 
 (define_code_iterator any_int_binop_no_shift_v_vdup [
-  plus minus and ior xor mult div udiv
+  plus minus and ior xor mult div udiv mod
 ])
 
 (define_code_iterator any_int_binop_no_shift_vdup_v [


[gcc r16-1392] [RISC-V] Fix ICE due to splitter emitting constant loads directly

2025-06-10 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:b93d8873cda88f0892c7782b274904fa8d3751fb

commit r16-1392-gb93d8873cda88f0892c7782b274904fa8d3751fb
Author: Jeff Law 
Date:   Tue Jun 10 06:38:52 2025 -0600

[RISC-V] Fix ICE due to splitter emitting constant loads directly

This is a fix for a bug found internally in Ventana using the cf3 testsuite.

cf3 looks to be dead as a project and has likely been subsumed by modern
fuzzers.  In fact, internally we tripped another issue with cf3 that had
already been reported by Edwin with the fuzzer he runs.

Anyway, the splitter in question blindly emits the 2nd adjusted constant into
a register.  That's not valid if the constant requires any kind of synthesis,
and it well could, since we're mostly focused on the first constant turning
into something that can be loaded via LUI without increasing the cost of the
second constant.

Instead of using the split RTL template, this just emits the code we want
directly, using riscv_emit_move to synthesize the constant into the provided
temporary register.

Tested on my system.  Waiting on upstream CI's verdict before moving forward.
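
For reference, the arithmetic the splitter relies on can be checked in
plain C.  The sketch below follows the sequence emitted in the diff
(t = ~C1; r = x | t; t = ~C1 | -C2; r = r - t) for an input of the form
(x & C1) + C2.  The function names are made up for illustration, and the
precondition that ~C1 and -C2 share no set bits is an assumption of this
sketch, under which the two forms are equal.

```c
#include <assert.h>
#include <stdint.h>

/* Straightforward form: (x & mask) + c.  */
static uint64_t
and_then_add (uint64_t x, uint64_t mask, uint64_t c)
{
  return (x & mask) + c;
}

/* Rewritten form, mirroring the emitted sequence:
     t = ~mask;  r = x | t;  t = ~mask | -c;  r = r - t.
   Assumes ~mask and -c share no set bits (an assumption of this
   sketch), so that OR-ing them equals adding them.  */
static uint64_t
or_then_sub (uint64_t x, uint64_t mask, uint64_t c)
{
  uint64_t t = ~mask;
  assert ((t & -c) == 0);
  uint64_t r = x | t;
  t = ~mask | -c;
  return r - t;
}
```

The second constant, ~mask | -c, is the one that may need several
instructions to build, which is why the fix routes it through the
constant synthesis path instead of emitting a plain move.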

gcc/
* config/riscv/riscv.md (lui-constraintand_to_or): Do not use
the RTL template for split code.  Emit it directly taking care to avoid
emitting a constant load that needed synthesis.  Fix formatting.

gcc/testsuite/
* gcc.target/riscv/ventana-16122.c: New test.

Diff:
---
 gcc/config/riscv/riscv.md  | 18 +-
 gcc/testsuite/gcc.target/riscv/ventana-16122.c | 19 +++
 2 files changed, 32 insertions(+), 5 deletions(-)

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 6d3c80a04c74..3aed25c25880 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -884,7 +884,7 @@
 ;; Where C1 is not a LUI operand, but ~C1 is a LUI operand
 
 (define_insn_and_split "*lui_constraint_and_to_or"
-   [(set (match_operand:X 0 "register_operand" "=r")
+  [(set (match_operand:X 0 "register_operand" "=r")
(plus:X (and:X (match_operand:X 1 "register_operand" "r")
   (match_operand 2 "const_int_operand"))
(match_operand 3 "const_int_operand")))
@@ -898,13 +898,21 @@
<= riscv_const_insns (operands[3], false)))"
   "#"
   "&& reload_completed"
-  [(set (match_dup 4) (match_dup 5))
-   (set (match_dup 0) (ior:X (match_dup 1) (match_dup 4)))
-   (set (match_dup 4) (match_dup 6))
-   (set (match_dup 0) (minus:X (match_dup 0) (match_dup 4)))]
+  [(const_int 0)]
   {
 operands[5] = GEN_INT (~INTVAL (operands[2]));
 operands[6] = GEN_INT ((~INTVAL (operands[2])) | (-INTVAL (operands[3])));
+
+/* This is always a LUI operand, so it's safe to just emit.  */
+emit_move_insn (operands[4], operands[5]);
+
+rtx x = gen_rtx_IOR (word_mode, operands[1], operands[4]);
+emit_move_insn (operands[0], x);
+
+/* This may require multiple steps to synthesize.  */
+riscv_emit_move (operands[4], operands[6]);
+x = gen_rtx_MINUS (word_mode, operands[0], operands[4]);
+emit_move_insn (operands[0], x);
   }
   [(set_attr "type" "arith")])
 
diff --git a/gcc/testsuite/gcc.target/riscv/ventana-16122.c b/gcc/testsuite/gcc.target/riscv/ventana-16122.c
new file mode 100644
index ..59e6467b57c0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/ventana-16122.c
@@ -0,0 +1,19 @@
+/* { dg-do compile { target { rv64 } } } */
+
+extern void NG (void);
+typedef signed char int8_t;
+typedef signed short int16_t;
+typedef signed int int32_t;
+void f74(void) {
+   int16_t x309 = 0x7fff;
+   volatile int32_t x310 = 0x7fff;
+   int8_t x311 = 59;
+   int16_t x312 = -0x8000;
+   static volatile int32_t t74 = 614992577;
+
+t74 = (x309==((x310^x311)%x312));
+
+if (t74 != 0) { NG(); } else { ; }
+   
+}
+


[gcc r12-11132] tree-sra: Do not create stores into const aggregates (PR111873)

2025-06-10 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:16d6a270b11a00d30966d42d9bc086e5873b5632

commit r12-11132-g16d6a270b11a00d30966d42d9bc086e5873b5632
Author: Martin Jambor 
Date:   Wed May 14 12:08:24 2025 +0200

tree-sra: Do not create stores into const aggregates (PR111873)

This patch fixes (hopefully the) one remaining place where gimple SRA
was still creating a store into const aggregates.  It occurs when there
is a replacement for a load but that replacement is not type
compatible - typically because it is a single field structure.

I have used testcases from duplicates because the original test-case
no longer reproduces for me.

gcc/ChangeLog:

2025-05-13  Martin Jambor  

PR tree-optimization/111873
* tree-sra.cc (sra_modify_expr): When processing a load which has
a type-incompatible replacement, do not store the contents of the
replacement into the original aggregate when that aggregate is
const.

gcc/testsuite/ChangeLog:

2025-05-13  Martin Jambor  

* gcc.dg/ipa/pr120044-1.c: New test.
* gcc.dg/ipa/pr120044-2.c: Likewise.
* gcc.dg/tree-ssa/pr114864.c: Likewise.

(cherry picked from commit 9d039eff453f777c58642ff16178c1ce2a4be6ab)

Diff:
---
 gcc/testsuite/gcc.dg/ipa/pr120044-1.c| 17 +
 gcc/testsuite/gcc.dg/ipa/pr120044-2.c| 17 +
 gcc/testsuite/gcc.dg/tree-ssa/pr114864.c | 15 +++
 gcc/tree-sra.cc  |  4 +++-
 4 files changed, 52 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/ipa/pr120044-1.c b/gcc/testsuite/gcc.dg/ipa/pr120044-1.c
new file mode 100644
index ..f9fee3e85afb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr120044-1.c
@@ -0,0 +1,17 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -fno-early-inlining -fno-tree-fre -fno-tree-pre -fno-code-hoisting -fno-inline" } */
+
+struct a {
+  int b;
+} const c;
+void d(char p, struct a e) {
+  while (e.b)
+;
+}
+static unsigned short f(const struct a g) {
+  d(g.b, g);
+  return g.b;
+}
+int main() {
+  return f(c);
+}
diff --git a/gcc/testsuite/gcc.dg/ipa/pr120044-2.c b/gcc/testsuite/gcc.dg/ipa/pr120044-2.c
new file mode 100644
index ..5130791f5444
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr120044-2.c
@@ -0,0 +1,17 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -fno-early-inlining -fno-tree-fre -fno-tree-pre -fno-code-hoisting -fno-ipa-cp" } */
+
+struct a {
+  int b;
+} const c;
+void d(char p, struct a e) {
+  while (e.b)
+;
+}
+static unsigned short f(const struct a g) {
+  d(g.b, g);
+  return g.b;
+}
+int main() {
+  return f(c);
+}
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr114864.c b/gcc/testsuite/gcc.dg/tree-ssa/pr114864.c
new file mode 100644
index ..cd9b94c094fc
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr114864.c
@@ -0,0 +1,15 @@
+/* { dg-do run } */
+/* { dg-options "-O1 -fno-tree-dce -fno-tree-fre" } */
+
+struct a {
+  int b;
+} const c;
+void d(const struct a f) {}
+void e(const struct a f) {
+  f.b == 0 ? 1 : f.b;
+  d(f);
+}
+int main() {
+  e(c);
+  return 0;
+}
diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc
index 91af2aef8b4c..4a7836bc257b 100644
--- a/gcc/tree-sra.cc
+++ b/gcc/tree-sra.cc
@@ -3871,8 +3871,10 @@ sra_modify_expr (tree *expr, gimple_stmt_iterator *gsi, bool write)
}
  else
{
- gassign *stmt;
+ if (TREE_READONLY (access->base))
+   return false;
 
+ gassign *stmt;
  if (access->grp_partial_lhs)
repl = force_gimple_operand_gsi (gsi, repl, true, NULL_TREE,
 true, GSI_SAME_STMT);


[gcc r16-1400] expand: Use less costly from sign and zero extensions for values where value range says they don't h

2025-06-10 Thread Jakub Jelinek via Gcc-cvs
https://gcc.gnu.org/g:2e4688a7202d73baeb4de18ca4591e6b0985f4a4

commit r16-1400-g2e4688a7202d73baeb4de18ca4591e6b0985f4a4
Author: Jakub Jelinek 
Date:   Tue Jun 10 20:06:14 2025 +0200

expand: Use less costly from sign and zero extensions for values where 
value range says they don't have MSB set [PR120434]

On top of the just posted patch, the following patch attempts to
use value range to see if MSB is known to be false and for scalar integral
extension in that case tries to expand both sign and zero extension and
chooses based on RTX costs the cheaper one (if the costs are the same,
it uses what it used before, based on TYPE_UNSIGNED (TREE_TYPE (treeop0))).
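
The selection rule described above can be sketched in isolation.  This is
only an illustration, not the code in expr.cc: struct ext_costs and
prefer_zero_extend are hypothetical names, and the two cost fields stand in
for calling seq_cost on one sequence with speed_p and with !speed_p.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two seq_cost results of one sequence.  */
struct ext_costs
{
  unsigned speed;  /* Cost when optimizing for speed.  */
  unsigned size;   /* Cost when optimizing for size.  */
};

/* Return true if the zero-extension sequence should be emitted,
   false if the sign-extension sequence should be emitted.  */
static bool
prefer_zero_extend (struct ext_costs uns, struct ext_costs sgn,
                    bool speed_p, bool type_unsigned)
{
  unsigned uns_cost = speed_p ? uns.speed : uns.size;
  unsigned sgn_cost = speed_p ? sgn.speed : sgn.size;

  /* On a tie, compare by the other metric instead.  */
  if (uns_cost == sgn_cost)
    {
      uns_cost = speed_p ? uns.size : uns.speed;
      sgn_cost = speed_p ? sgn.size : sgn.speed;
    }

  if (uns_cost != sgn_cost)
    return uns_cost < sgn_cost;

  /* Still tied: keep the previous behaviour and follow the signedness
     of the source type.  */
  return type_unsigned;
}
```

With equal costs under both metrics the result depends only on
type_unsigned, which matches the behaviour before the patch.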

The patch regresses the gcc.target/i386/pr78103-3.c test, will post
a separate patch for that momentarily (with the intent that if all 3
patches are approved, I'll commit the PR78103 related one before this one).

2025-06-10  Jakub Jelinek  

PR middle-end/120434
* expr.cc (expand_expr_real_2) : If get_range_pos_neg
at -O2 for scalar integer extension suggests the most significant
bit of op0 is not set, try both unsigned and signed conversion and
choose the cheaper one.  If both are the same cost, choose one
based on TYPE_UNSIGNED (TREE_TYPE (treeop0)).

* gcc.target/i386/pr120434-2.c: New test.

Diff:
---
 gcc/expr.cc| 64 +++---
 gcc/testsuite/gcc.target/i386/pr120434-2.c | 15 +++
 2 files changed, 74 insertions(+), 5 deletions(-)

diff --git a/gcc/expr.cc b/gcc/expr.cc
index 08a58fb5564e..ac4fdfaa2181 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -9879,14 +9879,68 @@ expand_expr_real_2 (const_sepops ops, rtx target, machine_mode tmode,
op0 = gen_rtx_fmt_e (TYPE_UNSIGNED (TREE_TYPE (treeop0))
 ? ZERO_EXTEND : SIGN_EXTEND, mode, op0);
 
+  else if (SCALAR_INT_MODE_P (GET_MODE (op0))
+  && optimize >= 2
+  && SCALAR_INT_MODE_P (mode)
+  && (GET_MODE_SIZE (as_a <scalar_int_mode> (mode))
+  > GET_MODE_SIZE (as_a <scalar_int_mode> (GET_MODE (op0))))
+  && get_range_pos_neg (treeop0,
+currently_expanding_gimple_stmt) == 1)
+   {
+ /* If argument is known to be positive when interpreted
+as signed, we can expand it as both sign and zero
+extension.  Choose the cheaper sequence in that case.  */
+ bool speed_p = optimize_insn_for_speed_p ();
+ rtx uns_ret = NULL_RTX, sgn_ret = NULL_RTX;
+ do_pending_stack_adjust ();
+ start_sequence ();
+ if (target == NULL_RTX)
+   uns_ret = convert_to_mode (mode, op0, 1);
+ else
+   convert_move (target, op0, 1);
+ rtx_insn *uns_insns = end_sequence ();
+ start_sequence ();
+ if (target == NULL_RTX)
+   sgn_ret = convert_to_mode (mode, op0, 0);
+ else
+   convert_move (target, op0, 0);
+ rtx_insn *sgn_insns = end_sequence ();
+ unsigned uns_cost = seq_cost (uns_insns, speed_p);
+ unsigned sgn_cost = seq_cost (sgn_insns, speed_p);
+ bool was_tie = false;
+
+ /* If costs are the same then use as tie breaker the other other
+factor.  */
+ if (uns_cost == sgn_cost)
+   {
+ uns_cost = seq_cost (uns_insns, !speed_p);
+ sgn_cost = seq_cost (sgn_insns, !speed_p);
+ was_tie = true;
+   }
+
+ if (dump_file && (dump_flags & TDF_DETAILS))
+   fprintf (dump_file, ";; positive extension:%s unsigned cost: %u; "
+   "signed cost: %u\n",
+was_tie ? " (needed tie breaker)" : "",
+uns_cost, sgn_cost);
+ if (uns_cost < sgn_cost
+ || (uns_cost == sgn_cost && TYPE_UNSIGNED (TREE_TYPE (treeop0))))
+   {
+ emit_insn (uns_insns);
+ sgn_ret = uns_ret;
+   }
+ else
+   emit_insn (sgn_insns);
+ if (target == NULL_RTX)
+   op0 = sgn_ret;
+ else
+   op0 = target;
+   }
   else if (target == 0)
-   op0 = convert_to_mode (mode, op0,
-  TYPE_UNSIGNED (TREE_TYPE
- (treeop0)));
+   op0 = convert_to_mode (mode, op0, TYPE_UNSIGNED (TREE_TYPE (treeop0)));
   else
{
- convert_move (target, op0,
-   TYPE_UNSIGNED (TREE_TYPE (treeop0)));
+ convert_move (target, op0, TYPE_UNSIGNED (TREE_TYPE (treeop0)));
  op0 = target;
}
 
diff --git a/gcc/testsuite/gcc.target/i386/pr120434-2.c b/gcc/testsuite/gcc.target/i386/pr120434-2.c
new file mode 100644
index ..4381e3b3bfdc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr120434-2.c
@@ -0,0 +1,15 @@
+/* PR middle-end

[gcc r16-1398] expand, ranger: Use ranger during expansion [PR120434]

2025-06-10 Thread Jakub Jelinek via Gcc-cvs
https://gcc.gnu.org/g:8154fc95f097a146f9c80edcaafb2baff73065b5

commit r16-1398-g8154fc95f097a146f9c80edcaafb2baff73065b5
Author: Jakub Jelinek 
Date:   Tue Jun 10 20:04:52 2025 +0200

expand, ranger: Use ranger during expansion [PR120434]

As the following testcase shows, during expansion we use value range info
in lots of places, but sadly currently use only the global ranges.
It is mostly through get_range_pos_neg function, which uses
get_global_range_query ()->range_of_expr (arg1, arg2)
but other spots use it directly.

On the testcase at the end of the patch, in foo we don't know range of x,
so we emit signed division in that case, which is less efficient at least on x86_64.
In bar, the default def SSA_NAME has global range and we try to expand
the division both as signed and unsigned because the range proves they will
have the same result and choose the cheaper one.
And finally in baz, we have VARYING in global range, but can do better if
we ask for range at the statement we're expanding.
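
The three situations can be illustrated with a sketch like the one below.
The testcase itself is not part of this excerpt, so the bodies and the
divisor 200 are assumptions chosen only to match the description of foo,
bar and baz above.

```c
/* Nothing is known about x, so signed division has to be used.  */
int
foo (int x)
{
  return x / 200;
}

/* Negative x is declared impossible on entry, so x is non-negative
   for the whole function and a global range is enough to see it.  */
int
bar (int x)
{
  if (x < 0)
    __builtin_unreachable ();
  return x / 200;
}

/* Over the whole function x can be anything; only at the division
   statement, under the x >= 0 guard, is it known to be non-negative.  */
int
baz (int x)
{
  if (x >= 0)
    return x / 200;
  return 24;
}
```

In bar and baz signed and unsigned division give the same result for every
value that can reach the division, which is what lets expansion pick the
cheaper of the two; in baz that fact is only visible when the range is
queried at the statement being expanded.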

The main problem of using the ranger during expansion is that things are in
flux, the already expanded basic blocks switch their IL from gimple to RTL
(bb->flags & BB_RTL) and the gimple stmts are gone, PHI nodes even earlier,
etc.

The patch attempts to make the ranger usable by keeping (bb->flags & BB_RTL)
== 0 on basic blocks for longer, in particular until the last
expand_gimple_basic_block call for the function.  Instead of changing the
IL right away, it uses a vector indexed by bb->index to hold the
future BB_HEAD/BB_END.  I had to do a few changes on the ranger side and
maybe testing in the wild will show a few extra cases, but I think those
are tolerable and can be guarded with currently_expanding_to_rtl so that
we don't punt on consistency checks on normal GIMPLE.
In particular, even with the patch there will still be some BB_RTL
bbs in the IL, e.g. the initial block after ENTRY, ENTRY and EXIT blocks
and from time to time others as well, but those should never contain
anything interesting from the ranger POV.  And switch expansion can drop
the default edge if it is __builtin_unreachable.

Also, I had to change the internal call TER expansion; the current way
of temporarily changing gimple_call_lhs ICEd badly in the ranger, so I'm
instead temporarily changing SSA_NAME_VAR of the SSA_NAME.

2025-06-10  Jakub Jelinek  

PR middle-end/120434
* cfgrtl.h (update_bb_for_insn_chain): Declare.
* cfgrtl.cc (update_bb_for_insn_chain): No longer static.
* cfgexpand.h (expand_remove_edge): Declare.
* cfgexpand.cc: Include "gimple-range.h".
(head_end_for_bb): New variable.
(label_rtx_for_bb): Drop ATTRIBUTE_UNUSED from bb argument.
Use head_end_for_bb if possible for non-BB_RTL bbs.
(expand_remove_edge): New function.
(maybe_cleanup_end_of_block): Use it instead of remove_edge.
(expand_gimple_cond): Don't clear EDGE_TRUE_VALUE and
EDGE_FALSE_VALUE just yet.  Use head_end_for_bb elts instead
of BB_END and update_bb_for_insn_chain instead of update_bb_for_insn.
(expand_gimple_tailcall): Use expand_remove_edge instead of
remove_edge.  Use head_end_for_bb elts instead of BB_END and
update_bb_for_insn_chain instead of update_bb_for_insn.
(expand_gimple_basic_block): Don't change bb to BB_RTL here, instead
use head_end_for_bb elts instead of BB_HEAD and BB_END.  Use
update_bb_for_insn_chain instead of update_bb_for_insn.
(pass_expand::execute): Enable ranger before expand_gimple_basic_block
calls and create head_end_for_bb vector.  Disable ranger after
those calls, turn still non-BB_RTL blocks into BB_RTL and set their
BB_HEAD and BB_END from head_end_for_bb elts, and clear EDGE_TRUE_VALUE
and EDGE_FALSE_VALUE flags on edges.  Release head_end_for_bb
vector.
* tree-outof-ssa.cc (expand_phi_nodes): Don't clear phi nodes here.
* tree.h (get_range_pos_neg): Add gimple * argument defaulted to NULL.
* tree.cc (get_range_pos_neg): Add stmt argument.  Use
get_range_query (cfun) instead of get_global_range_query () and pass
stmt as third argument to range_of_expr.
* expr.cc (expand_expr_divmod): Pass currently_expanding_gimple_stmt
to get_range_pos_neg.
(expand_expr_real_1) : Change internal fn handling
to avoid temporarily overwriting gimple_call_lhs of ifn, instead
temporarily overwrite SSA_NAME_VAR of its lhs.
(maybe_optimize_pow2p_mod_cmp): Pass currently_expanding_gimple_stmt
to get_range_pos_neg.
(maybe_optimize_mod_cmp): Li

[gcc r16-1399] i386: Handle ZERO_EXTEND like SIGN_EXTEND in bsr patterns [PR120434]

2025-06-10 Thread Jakub Jelinek via Gcc-cvs
https://gcc.gnu.org/g:54da199f28da07166a44eae7d53acb9e3abe1306

commit r16-1399-g54da199f28da07166a44eae7d53acb9e3abe1306
Author: Jakub Jelinek 
Date:   Tue Jun 10 20:07:06 2025 +0200

i386: Handle ZERO_EXTEND like SIGN_EXTEND in bsr patterns [PR120434]

The just posted second PR120434 patch causes
+FAIL: gcc.target/i386/pr78103-3.c scan-assembler m(leaq|addq|incq)M
+FAIL: gcc.target/i386/pr78103-3.c scan-assembler-not mmovlM+
+FAIL: gcc.target/i386/pr78103-3.c scan-assembler-not msubqM
+FAIL: gcc.target/i386/pr78103-3.c scan-assembler-not mxor[lq]M
While the patch generally improves code generation by often using
ZERO_EXTEND instead of SIGN_EXTEND, where the former is often for free
on x86_64 while the latter requires an extra instruction or larger
instruction than one with just zero extend, the PR78103 combine patterns
and splitters were written only with SIGN_EXTEND in mind.  As CLZ is UB
on 0 and otherwise returns just [0,63] and is xored with 63, ZERO_EXTEND
does the same thing there as SIGN_EXTEND.
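
The argument can be checked with a small sketch.  It assumes a compiler
that provides __builtin_clzll (GCC does); highest_bit and extensions_agree
are hypothetical helper names, not anything in i386.md.

```c
#include <stdint.h>

/* Bit index of the highest set bit, written the way the bsr patterns
   see it: clz xored with 63.  X must be nonzero, because
   __builtin_clzll (0) is undefined.  */
static int
highest_bit (uint64_t x)
{
  return __builtin_clzll (x) ^ 63;
}

/* Return 1 if, for every single-bit input, widening the 32-bit result
   by sign extension and by zero extension gives the same 64-bit value.  */
static int
extensions_agree (void)
{
  for (int bit = 0; bit < 64; bit++)
    {
      int32_t idx = highest_bit ((uint64_t) 1 << bit);
      int64_t sext = (int64_t) idx;               /* Sign extension.  */
      uint64_t zext = (uint64_t) (uint32_t) idx;  /* Zero extension.  */
      if (idx != bit || (uint64_t) sext != zext)
        return 0;
    }
  return 1;
}
```

Because the result always lies in [0,63], bit 31 of the 32-bit value is
never set, so both extensions fill the upper half with zeros.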

2025-06-10  Jakub Jelinek  

PR middle-end/120434
* config/i386/i386.md (*bsr_rex64_2): Rename to ...
(*bsr_rex64<u>_2): ... this.  Use any_extend instead of sign_extend.
(*bsr_2): Rename to ...
(*bsr<u>_2): ... this.  Use any_extend instead of sign_extend.
(bsr splitters after those): Use any_extend instead of sign_extend.

Diff:
---
 gcc/config/i386/i386.md | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 8eee44756eba..99f382497148 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -21512,11 +21512,12 @@
(set_attr "mode" "SI")])
 
 ; As bsr is undefined behavior on zero and for other input
-; values it is in range 0 to 63, we can optimize away sign-extends.
-(define_insn_and_split "*bsr_rex64_2"
+; values it is in range 0 to 63, we can optimize away sign-extends
+; or zero-extends.
+(define_insn_and_split "*bsr_rex64<u>_2"
   [(set (match_operand:DI 0 "register_operand")
(xor:DI
- (sign_extend:DI
+ (any_extend:DI
(minus:SI
  (const_int 63)
  (subreg:SI (clz:DI (match_operand:DI 1 "nonimmediate_operand"))
@@ -21538,9 +21539,9 @@
   operands[3] = lowpart_subreg (SImode, operands[2], DImode);
 })
 
-(define_insn_and_split "*bsr_2"
+(define_insn_and_split "*bsr<u>_2"
   [(set (match_operand:DI 0 "register_operand")
-   (sign_extend:DI
+   (any_extend:DI
  (xor:SI
(minus:SI
  (const_int 31)
@@ -21617,7 +21618,7 @@
(minus:DI
  (match_operand:DI 2 "const_int_operand")
  (xor:DI
-   (sign_extend:DI
+   (any_extend:DI
  (minus:SI (const_int 63)
(subreg:SI
  (clz:DI (match_operand:DI 1 "nonimmediate_operand"))
@@ -21647,7 +21648,7 @@
   [(set (match_operand:DI 0 "register_operand")
(minus:DI
  (match_operand:DI 2 "const_int_operand")
- (sign_extend:DI
+ (any_extend:DI
(xor:SI
  (minus:SI (const_int 31)
(clz:SI (match_operand:SI 1 "nonimmediate_operand")))


[gcc r16-1393] gcn: Add experimental MI300 (gfx942) support

2025-06-10 Thread Tobias Burnus via Gcc-cvs
https://gcc.gnu.org/g:37b454b7e171bd8a792cbe4c57ea0f9702afa22d

commit r16-1393-g37b454b7e171bd8a792cbe4c57ea0f9702afa22d
Author: Tobias Burnus 
Date:   Tue Jun 10 15:12:47 2025 +0200

gcn: Add experimental MI300 (gfx942) support

As gfx942 and gfx950 belong to gfx9-4-generic, the latter two are also added.
Note that there are no specific optimizations for MI300, yet.

No multilib is built by default for any of the mentioned devices; use
'--with-multilib-list=' when configuring GCC to build them alongside.
gfx942 was added in LLVM (and its mc assembler, used by GCC) in version 18,
generic support in LLVM 19 and gfx950 in LLVM 20.

gcc/ChangeLog:

* config/gcn/gcn-devices.def: Add gfx942, gfx950 and gfx9-4-generic.
* config/gcn/gcn-opts.h (TARGET_CDNA3, TARGET_CDNA3_PLUS,
TARGET_GLC_NAME, TARGET_TARGET_SC_CACHE): Define.
(TARGET_ARCHITECTED_FLAT_SCRATCH): Use also for CDNA3.
* config/gcn/gcn.h (gcn_isa): Add ISA_CDNA3 to the enum.
* config/gcn/gcn.cc (print_operand): Update 'g' to use
TARGET_GLC_NAME; add 'G' to print TARGET_GLC_NAME unconditionally.
* config/gcn/gcn-valu.md (scatter, gather): Use TARGET_GLC_NAME.
* config/gcn/gcn.md: Use %G instead of glc; use 'buffer_inv sc1'
for TARGET_TARGET_SC_CACHE.
* doc/invoke.texi (march): Add gfx942, gfx950 and gfx9-4-generic.
* doc/install.texi (amdgcn*-*-*): Add gfx942, gfx950 and gfx9-4-generic.
* config/gcn/gcn-tables.opt: Regenerate.

libgomp/ChangeLog:

* testsuite/libgomp.c/declare-variant-4.h (gfx942): New variant function.
* testsuite/libgomp.c/declare-variant-4-gfx942.c: New test.

Diff:
---
 gcc/config/gcn/gcn-devices.def |  33 
 gcc/config/gcn/gcn-opts.h  |  13 +-
 gcc/config/gcn/gcn-tables.opt  |   9 ++
 gcc/config/gcn/gcn-valu.md |   8 +-
 gcc/config/gcn/gcn.cc  |   8 +-
 gcc/config/gcn/gcn.h   |   2 +
 gcc/config/gcn/gcn.md  | 168 +
 gcc/doc/install.texi   |  17 ++-
 gcc/doc/invoke.texi|  10 ++
 .../testsuite/libgomp.c/declare-variant-4-gfx942.c |   8 +
 libgomp/testsuite/libgomp.c/declare-variant-4.h|   8 +
 11 files changed, 208 insertions(+), 76 deletions(-)

diff --git a/gcc/config/gcn/gcn-devices.def b/gcc/config/gcn/gcn-devices.def
index af1420382e2f..426acf0cb7a5 100644
--- a/gcc/config/gcn/gcn-devices.def
+++ b/gcc/config/gcn/gcn-devices.def
@@ -171,6 +171,28 @@ GCN_DEVICE(gfx90c, GFX90C, 0x32, ISA_GCN5,
   /* Generic Name */ GFX9_GENERIC
   )
 
+GCN_DEVICE(gfx942, GFX942, 0x4c, ISA_CDNA3,
+  /* XNACK default */ HSACO_ATTR_ANY,
+  /* SRAM_ECC default */ HSACO_ATTR_ANY,
+  /* WAVE64 mode */ HSACO_ATTR_UNSUPPORTED,
+  /* CU mode */ HSACO_ATTR_UNSUPPORTED,
+  /* Max ISA VGPRs */ 512,
+  /* Generic code obj version */ 0,  /* non-generic */
+  /* Architecture Family */ GFX9,
+  /* Generic Name */ NONE
+  )
+
+GCN_DEVICE(gfx950, GFX950, 0x4f, ISA_CDNA3,
+  /* XNACK default */ HSACO_ATTR_ANY,
+  /* SRAM_ECC default */ HSACO_ATTR_ANY,
+  /* WAVE64 mode */ HSACO_ATTR_UNSUPPORTED,
+  /* CU mode */ HSACO_ATTR_UNSUPPORTED,
+  /* Max ISA VGPRs */ 512,
+  /* Generic code obj version */ 0,  /* non-generic */
+  /* Architecture Family */ GFX9,
+  /* Generic Name */ NONE
+  )
+
 GCN_DEVICE(gfx9-generic, GFX9_GENERIC, 0x051, ISA_GCN5,
   /* XNACK default */ HSACO_ATTR_ANY,
   /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
@@ -182,6 +204,17 @@ GCN_DEVICE(gfx9-generic, GFX9_GENERIC, 0x051, ISA_GCN5,
   /* Generic Name */ NONE
   )
 
+GCN_DEVICE(gfx9-4-generic, GFX9_4_GENERIC, 0x05f, ISA_CDNA3,
+  /* XNACK default */ HSACO_ATTR_ANY,
+  /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
+  /* WAVE64 mode */ HSACO_ATTR_UNSUPPORTED,
+  /* CU mode */ HSACO_ATTR_UNSUPPORTED,
+  /* Max ISA VGPRs */ 256,
+  /* Generic code obj version */ 1,
+  /* Architecture Family */ GFX9,
+  /* Generic Name */ NONE
+  )
+
 /* GCN GFX10.3 (RDNA 2) */
 
 GCN_DEVICE(gfx1030, GFX1030, 0x36, ISA_RDNA2,
diff --git a/gcc/config/gcn/gcn-opts.h b/gcc/config/gcn/gcn-opts.h
index 88f562dfc1e1..bcea14f3fe7a 100644
--- a/gcc/config/gcn/gcn-opts.h
+++ b/gcc/config/gcn/gcn-opts.h
@@ -33,7 +33,8 @@ extern enum gcn_isa {
   ISA_RDNA2,
   ISA_RDNA3,
   ISA_CDNA1,
-  ISA_CDNA2
+  ISA_CDNA2,
+  ISA_CDNA3
 } gcn_isa;
 
 #define TARGET_GCN5 (gcn_isa == ISA_GCN5)
@@ -41,6 +42,8 @@ extern enum gcn_isa {
 #define TARGET_CDNA1_PLUS (gcn_isa >= ISA_CDNA1)
 

[gcc r16-1389] ada: Remove redundant guard against attribute with no expressions

2025-06-10 Thread Marc Poulhies via Gcc-cvs
https://gcc.gnu.org/g:5dc946f78e6d0ba73fd33990b3a353f113ecdd64

commit r16-1389-g5dc946f78e6d0ba73fd33990b3a353f113ecdd64
Author: Piotr Trojanek 
Date:   Wed Mar 26 18:42:10 2025 +0100

ada: Remove redundant guard against attribute with no expressions

We intentionally allow First to work on No_List, so there is no need to
guard against a No_List. Code cleanup; semantics is unaffected.

gcc/ada/ChangeLog:

* sem_attr.adb (Resolve_Attribute): Remove redundant guard.

Diff:
---
 gcc/ada/sem_attr.adb | 1 -
 1 file changed, 1 deletion(-)

diff --git a/gcc/ada/sem_attr.adb b/gcc/ada/sem_attr.adb
index d4034d28da60..4f5047f7b974 100644
--- a/gcc/ada/sem_attr.adb
+++ b/gcc/ada/sem_attr.adb
@@ -13014,7 +13014,6 @@ package body Sem_Attr is
 --  their Entity attribute to reference their discriminal.
 
 if Expander_Active
-  and then Present (Expressions (N))
   and then Attr_Id /= Attribute_Make
 then
declare


[gcc r16-1376] ada: VAST: create treewalker

2025-06-10 Thread Marc Poulhies via Gcc-cvs
https://gcc.gnu.org/g:3f30c88d17742ec2dff131c470b4ee6483d455e7

commit r16-1376-g3f30c88d17742ec2dff131c470b4ee6483d455e7
Author: Bob Duff 
Date:   Mon Mar 24 16:21:53 2025 -0400

ada: VAST: create treewalker

Walks all trees (not just the main unit), deals with switches and
flags. Doesn't check much of anything yet (asserts that "unused" nodes
are not present).

Move decisions (what tree(s) to check, what switches enable checking)
from the caller to the body of VAST.

gcc/ada/ChangeLog:

* vast.adb: Initial implementation.
* vast.ads: Rename procedure. Remove parameter; body should decide
what to do.
* lib.ads (ipu): Minor: Rewrite comment for brevity, and because
of an inconvenient misspelling.
(Num_Units): Not used; remove.
(Remove_Unit): Minor: Remove "Currently" (which was current a decade
ago) from comment.
* lib.adb (Num_Units): Not used; remove.
* debug_a.adb (Debug_A_Entry): Fix bug: Use Write_Name_For_Debug,
so this won't crash on the Error node.
* debug.adb: Document -gnatd_V and -gnatd_W compiler switches.
* exp_ch6.adb (Validate_Subprogram_Calls): Remove redundant check for
Serious_Errors_Detected. (We turn off code gen when errors are
detected.)
* frontend.adb: Move decisions into VAST body.
* namet.ads (Present): Remove unnecessary overriding; these are
inherited by the derived types.
* namet.adb (Present): Likewise.

Diff:
---
 gcc/ada/debug.adb|  11 +++--
 gcc/ada/debug_a.adb  |   7 +--
 gcc/ada/exp_ch6.adb  |  10 ++---
 gcc/ada/frontend.adb |   4 +-
 gcc/ada/lib.adb  |   9 
 gcc/ada/lib.ads  |  13 ++
 gcc/ada/namet.adb|  18 
 gcc/ada/namet.ads|   8 
 gcc/ada/vast.adb | 123 ---
 gcc/ada/vast.ads |   7 +--
 10 files changed, 139 insertions(+), 71 deletions(-)

diff --git a/gcc/ada/debug.adb b/gcc/ada/debug.adb
index 3a39ec89c40f..f250d7429a96 100644
--- a/gcc/ada/debug.adb
+++ b/gcc/ada/debug.adb
@@ -186,8 +186,8 @@ package body Debug is
--  d_S
--  d_T  Output trace information on invocation path recording
--  d_U  Disable prepending messages with "error:".
-   --  d_V  Enable verifications on the expanded tree
-   --  d_W
+   --  d_V  Enable VAST (verifications on the expanded tree)
+   --  d_W  Enable VAST in verbose mode
--  d_X  Disable assertions to check matching of extra formals
--  d_Y
--  d_Z
@@ -1065,8 +1065,11 @@ package body Debug is
--  d_U  Disable prepending 'error:' to error messages. This used to be the
--   default and can be seen as the opposite of -gnatU.
 
-   --  d_V  Enable verification of the expanded code before calling the backend
-   --   and generate error messages on each inconsistency found.
+   --  d_V  Enable VAST (Verifier for the Ada Semantic Tree). This does
+   --   verification of the expanded code before calling the backend.
+
+   --  d_W  Same as d_V, but also prints lots of tracing/debugging output
+   --   as it walks the tree.
 
--  d_X  Disable assertions to check matching of extra formals; switch added
--   temporarily to disable these checks until this work is complete if
diff --git a/gcc/ada/debug_a.adb b/gcc/ada/debug_a.adb
index d36ae696af64..8d68fc8eff7d 100644
--- a/gcc/ada/debug_a.adb
+++ b/gcc/ada/debug_a.adb
@@ -83,11 +83,8 @@ package body Debug_A is
 
  case Nkind (N) is
 when N_Has_Chars =>
-   Write_Str (" """);
-   if Present (Chars (N)) then
-  Write_Str (Get_Name_String (Chars (N)));
-   end if;
-   Write_Str ();
+   Write_Str (" ");
+   Write_Name_For_Debug (Chars (N));
 when others => null;
  end case;
 
diff --git a/gcc/ada/exp_ch6.adb b/gcc/ada/exp_ch6.adb
index 3a45b1c59340..2a246adbb8a3 100644
--- a/gcc/ada/exp_ch6.adb
+++ b/gcc/ada/exp_ch6.adb
@@ -9938,15 +9938,15 @@ package body Exp_Ch6 is
--  Start of processing for Validate_Subprogram_Calls
 
begin
-  --  No action required if we are not generating code or compiling sources
-  --  that have errors.
+  --  No action if we are not generating code (including if we have
+  --  errors).
 
-  if Serious_Errors_Detected > 0
-or else Operating_Mode /= Generate_Code
-  then
+  if Operating_Mode /= Generate_Code then
  return;
   end if;
 
+  pragma Assert (Serious_Errors_Detected = 0);
+
   Check_Calls (N);
end Validate_Subprogram_Calls;
 
diff --git a/gcc/ada/frontend.adb b/gcc/ada/frontend.adb
index 12cea9c794a3..d5376788ce42 100644
--- a/gcc/ada/frontend.adb
+++ b/gcc/ada/frontend.adb
@@ -506,9 +506,7 @@ begin
 
--  Verify the validity of the tree
 
-   if 

[gcc r15-9812] ada: Fix infinite loop with aggregate in generic unit

2025-06-10 Thread Eric Botcazou via Gcc-cvs
https://gcc.gnu.org/g:8a4b72a2d99918d6bc315f2664a22457b9848ce7

commit r15-9812-g8a4b72a2d99918d6bc315f2664a22457b9848ce7
Author: Eric Botcazou 
Date:   Thu Mar 20 23:29:33 2025 +0100

ada: Fix infinite loop with aggregate in generic unit

Root_Type does not return the same type for the private and the full view of
a derived private tagged type when both derive from an interface type.

gcc/ada/ChangeLog:

* sem_ch12.adb (Copy_Generic_Node): Do not call Root_Type to find
the root type of an aggregate of a derived tagged type.

Diff:
---
 gcc/ada/sem_ch12.adb | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/gcc/ada/sem_ch12.adb b/gcc/ada/sem_ch12.adb
index d93788b779e9..02c7c3696e82 100644
--- a/gcc/ada/sem_ch12.adb
+++ b/gcc/ada/sem_ch12.adb
@@ -9340,9 +9340,6 @@ package body Sem_Ch12 is
   and then Nkind (Ancestor_Type (N)) in N_Entity
 then
declare
-  Root_Typ : constant Entity_Id :=
-   Root_Type (Ancestor_Type (N));
-
   Typ : Entity_Id := Ancestor_Type (N);
 
begin
@@ -9351,7 +9348,7 @@ package body Sem_Ch12 is
 Switch_View (Typ);
  end if;
 
- exit when Typ = Root_Typ;
+ exit when Etype (Typ) = Typ;
 
  Typ := Etype (Typ);
   end loop;

