Re: [PATCH v3 6/6] aarch64: Add DLL import/export to AArch64 target

2024-06-25 Thread Christophe Lyon
On Fri, 21 Jun 2024 at 15:51, Richard Sandiford wrote:
>
> Evgeny Karpov  writes:
> > Monday, June 10, 2024 7:03 PM
> > Richard Sandiford  wrote:
> >
> >> Thanks for the update.  Parts 1-5 look good to me.  Some minor comments
> >> below about part 6:
> >>
> >> If the TARGET_DLLIMPORT_DECL_ATTRIBUTES condition can be dropped, the
> >> series is OK from my POV with that change and with the changes above.
> >> Please get sign-off from an x86 maintainer too though.
> >
> > Thank you for the review and suggestions. Here is the updated version of 
> > patch 6, based on the comments.
> > The x86 and mingw maintainers have already approved the series.
> >
> > Regards,
> > Evgeny
> >
> >
> >
> > This patch reuses the MinGW implementation to enable DLL import/export
> > functionality for the aarch64-w64-mingw32 target. It also modifies
> > environment configurations for MinGW.
> >
> > gcc/ChangeLog:
> >
> >   * config.gcc: Add winnt-dll.o, which contains the DLL
> >   import/export implementation.
> >   * config/aarch64/aarch64.cc (aarch64_legitimize_pe_coff_symbol):
> >   Add a conditional function that reuses the MinGW implementation
> >   for COFF and does nothing otherwise.
> >   (aarch64_expand_call): Add dllimport implementation.
> >   (aarch64_legitimize_address): Likewise.
> >   * config/aarch64/cygming.h (SYMBOL_FLAG_DLLIMPORT): Modify MinGW
> >   environment to support DLL import/export.
> >   (SYMBOL_FLAG_DLLEXPORT): Likewise.
> >   (SYMBOL_REF_DLLIMPORT_P): Likewise.
> >   (SYMBOL_FLAG_STUBVAR): Likewise.
> >   (SYMBOL_REF_STUBVAR_P): Likewise.
> >   (TARGET_VALID_DLLIMPORT_ATTRIBUTE_P): Likewise.
> >   (TARGET_ASM_FILE_END): Likewise.
> >   (SUB_TARGET_RECORD_STUB): Likewise.
> >   (GOT_ALIAS_SET): Likewise.
> >   (PE_COFF_EXTERN_DECL_SHOULD_BE_LEGITIMIZED): Likewise.
> >   (HAVE_64BIT_POINTERS): Likewise.
>
> OK, thanks.  If you'd like commit access, please follow the instructions
> on https://gcc.gnu.org/gitwrite.html , listing me as sponsor.
>

I've just pushed the series on Evgeny's behalf.

Thanks,

Christophe

> Richard.
>
> > ---
> >  gcc/config.gcc|  4 +++-
> >  gcc/config/aarch64/aarch64.cc | 26 ++
> >  gcc/config/aarch64/cygming.h  | 26 --
> >  3 files changed, 53 insertions(+), 3 deletions(-)
> >
> > diff --git a/gcc/config.gcc b/gcc/config.gcc
> > index d053b98efa8..331285b7b6d 100644
> > --- a/gcc/config.gcc
> > +++ b/gcc/config.gcc
> > @@ -1276,10 +1276,12 @@ aarch64-*-mingw*)
> >   tm_file="${tm_file} mingw/mingw32.h"
> >   tm_file="${tm_file} mingw/mingw-stdint.h"
> >   tm_file="${tm_file} mingw/winnt.h"
> > + tm_file="${tm_file} mingw/winnt-dll.h"
> >   tmake_file="${tmake_file} aarch64/t-aarch64"
> >   target_gtfiles="$target_gtfiles \$(srcdir)/config/mingw/winnt.cc"
> > + target_gtfiles="$target_gtfiles \$(srcdir)/config/mingw/winnt-dll.cc"
> >   extra_options="${extra_options} mingw/cygming.opt mingw/mingw.opt"
> > - extra_objs="${extra_objs} winnt.o"
> > + extra_objs="${extra_objs} winnt.o winnt-dll.o"
> >   c_target_objs="${c_target_objs} msformat-c.o"
> >   d_target_objs="${d_target_objs} winnt-d.o"
> >   tmake_file="${tmake_file} mingw/t-cygming"
> > diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> > index 3418e57218f..32e31e08449 100644
> > --- a/gcc/config/aarch64/aarch64.cc
> > +++ b/gcc/config/aarch64/aarch64.cc
> > @@ -860,6 +860,10 @@ static const attribute_spec aarch64_gnu_attributes[] =
> >{ "Advanced SIMD type", 1, 1, false, true,  false, true,  NULL, NULL },
> >{ "SVE type",3, 3, false, true,  false, true,  NULL, 
> > NULL },
> >{ "SVE sizeless type",  0, 0, false, true,  false, true,  NULL, NULL },
> > +#if TARGET_DLLIMPORT_DECL_ATTRIBUTES
> > +  { "dllimport", 0, 0, false, false, false, false, handle_dll_attribute, 
> > NULL },
> > +  { "dllexport", 0, 0, false, false, false, false, handle_dll_attribute, 
> > NULL },
> > +#endif
> >  #ifdef SUBTARGET_ATTRIBUTE_TABLE
> >SUBTARGET_ATTRIBUTE_TABLE
> >  #endif
> > @@ -2865,6 +2869,15 @@ static void
> >  aarch64_load_symref_appropriately (rtx dest, rtx imm,
> >  enum aarch64_symbol_type type)
> >  {
> > +#if TARGET_PECOFF
> > +  rtx tmp = legitimize_pe_coff_symbol (imm, true);
> > +  if (tmp)
> > +{
> > +  emit_insn (gen_rtx_SET (dest, tmp));
> > +  return;
> > +}
> > +#endif
> > +
> >switch (type)
> >  {
> >  case SYMBOL_SMALL_ABSOLUTE:
> > @@ -11233,6 +11246,13 @@ aarch64_expand_call (rtx result, rtx mem, rtx 
> > cookie, bool sibcall)
> >
> >gcc_assert (MEM_P (mem));
> >callee = XEXP (mem, 0);
> > +
> > +#if TARGET_PECOFF
> > +  tmp = legitimize_pe_coff_symbol (callee, false);
> > +  if (tmp)
> > +callee = tmp;
> > +#endif
> > +
> >mode = GET_MODE (callee);
> >gcc_assert (mode == Pmode);
> >

Re: [PATCH 05/52] rust: Replace uses of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE

2024-06-25 Thread Kewen.Lin
on 2024/6/21 18:36, Kewen.Lin wrote:
> Hi Arthur,
> 
> on 2024/6/21 18:17, Arthur Cohen wrote:
>> Hi,
>>
>> Sorry about the delay in my answer! The patch looks good to me :) Will you 
>> push it as part of your patchset?
>>
> 
> Thanks for the review!  Since this one doesn't necessarily depend on
> "09/52 Replace {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE with new hook
> mode_for_floating_type", I'm going to push this before that (just like
> the other FE changes excepting for the jit one 10/52 which depends on
> the new hook 09/52).  btw, all after 09/52 would be merged into 09/52
> when committing. :)

Pushed as r15-1592-gbcd1b7a097031d, thanks!

BR,
Kewen


[PATCH] Add function filtering to gcov

2024-06-25 Thread Jørgen Kvalsvik
Add the --include and --exclude flags to gcov to control which functions
to report on. This is meant to make gcov more practical when writing
test suites or performing other coverage experiments, which tend to
focus on a few functions at a time. This really shines in combination
with the -t/--stdout flag. With support for more expansive metrics in
gcov like modified condition/decision coverage (MC/DC) and path
coverage, output quickly gets overwhelming without filtering.

The approach is quite simple: filters are egrep regexes and are
evaluated left-to-right, and the last filter "wins"; that is, if a
function matches an --include and a subsequent --exclude, it is not
included in the output. All of the output machinery works on the
function table, so optionally (not) adding functions to it makes even
the JSON output work as expected, and only minor changes are needed to
suppress the filtered-out functions.
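
The last-filter-wins rule can be sketched roughly as below. This is only
an illustration, not the code in gcov.cc: the names Filter, make_filter
and should_report are hypothetical, and the default for a function that
matches no filter (reported unless some --include was given) is an
assumption inferred from the demo output further down.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical filter record: a compiled pattern plus its polarity.
struct Filter
{
  std::regex pattern;
  bool include; // true for --include, false for --exclude
};

// Compile an expression with POSIX extended (egrep-style) syntax.
Filter make_filter (const std::string &expr, bool include)
{
  return Filter{std::regex (expr, std::regex::extended), include};
}

// Decide whether a function name should appear in the report.
bool should_report (const std::string &name,
                    const std::vector<Filter> &filters)
{
  // Assumption: if any --include is present, unmatched functions are
  // dropped; otherwise unmatched functions are kept.
  bool any_include = false;
  for (const Filter &f : filters)
    any_include = any_include || f.include;
  bool report = !any_include;

  // Evaluate left-to-right; a later match overrides an earlier one.
  for (const Filter &f : filters)
    if (std::regex_search (name, f.pattern))
      report = f.include;
  return report;
}
```

With the filters --include=su --exclude=sum, this sketch reports sub but
not sum, and drops mul because it matches no --include.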

Demo: math.c

int mul (int a, int b) {
    return a * b;
}

int sub (int a, int b) {
    return a - b;
}

int sum (int a, int b) {
    return a + b;
}

Plain matches:

$ gcov -t math --include=sum
-:0:Source:filter.c
-:0:Graph:filter.gcno
-:0:Data:-
-:0:Runs:0
#:9:int sum (int a, int b) {
#:   10:return a + b;

$ gcov -t math --include=mul
-:0:Source:filter.c
-:0:Graph:filter.gcno
-:0:Data:-
-:0:Runs:0
#:1:int mul (int a, int b) {
#:2:return a * b;

Regex match:

$ gcov -t math --include=su
-:0:Source:filter.c
-:0:Graph:filter.gcno
-:0:Data:-
-:0:Runs:0
#:5:int sub (int a, int b) {
#:6:return a - b;
-:7:}
#:9:int sum (int a, int b) {
#:   10:return a + b;

And similar for exclude:

$ gcov -t math --exclude=sum
-:0:Source:filter.c
-:0:Graph:filter.gcno
-:0:Data:-
-:0:Runs:0
#:1:int mul (int a, int b) {
#:2:return a * b;
-:3:}
#:5:int sub (int a, int b) {
#:6:return a - b;

And json, for good measure:

$ gcov -t math --include=sum --json | jq ".files[].lines[]"
{
  "line_number": 9,
  "function_name": "sum",
  "count": 0,
  "unexecuted_block": true,
  "block_ids": [],
  "branches": [],
  "calls": []
}
{
  "line_number": 10,
  "function_name": "sum",
  "count": 0,
  "unexecuted_block": true,
  "block_ids": [
2
  ],
  "branches": [],
  "calls": []
}

Note that the last function gets "clipped" when lines are associated to
functions, which means the closing brace is dropped from the report. I
hope this can be fixed, but considering it is not really a part of the
function body, the gcov report is "complete".

Matching generally works well for mangled names, as the mangled names
also contain the base symbol name. By default, functions are matched
by the mangled name, which means matching on base names always works as
expected. The -M flag makes the matching work on the demangled name,
which is quite useful when you only want to report on specific
overloads and can use the full type names.

Why not just use grep? grep is not really sufficient as grep is very
line oriented, and the reports that benefit the most from filtering
often unpredictably span multiple lines based on the state of coverage.
For example, a condition coverage report for 3 terms/6 outcomes only
outputs 1 line when all conditions are covered, and 7 with no lines
covered.

gcc/ChangeLog:

* doc/gcov.texi: Add --include, --exclude, --match-on-demangled
documentation.
* gcov.cc (struct fnfilter): New.
(print_usage): Add --include, --exclude, -M,
--match-on-demangled.
(process_args): Likewise.
(release_structures): Release filters.
(read_graph_file): Only add function_infos matching filters.
(output_lines): Likewise.

gcc/testsuite/ChangeLog:

* lib/gcov.exp: Add filtering test function.
* g++.dg/gcov/gcov-19.C: New test.
* g++.dg/gcov/gcov-20.C: New test.
* g++.dg/gcov/gcov-21.C: New test.
* gcc.misc-tests/gcov-25.c: New test.
* gcc.misc-tests/gcov-26.c: New test.
* gcc.misc-tests/gcov-27.c: New test.
* gcc.misc-tests/gcov-28.c: New test.
---
 gcc/doc/gcov.texi  | 113 ++
 gcc/gcov.cc| 128 +++--
 gcc/testsuite/g++.dg/gcov/gcov-19.C|  35 +++
 gcc/testsuite/g++.dg/gcov/gcov-20.C|  38 
 gcc/testsuite/g++.dg/gcov/gcov-21.C|  32 +++
 gcc/testsuite/gcc.misc-tests/gcov-25.c |  25 +
 gcc/testsuite/gcc.misc-tests/gcov-26.c |  25 +
 gcc/testsuite/gcc.misc-tests/gcov-27.c |  24 +
 gcc/testsuite/gcc.misc-tests/gcov-28.c |  22 +
 gcc/testsuite/lib/gcov.exp |  53 ++

[PATCH][c++ frontend]: check for missing condition for novector [PR115623]

2024-06-25 Thread Tamar Christina
Hi All,

It looks like I forgot to check in the C++ frontend whether a condition exists
for the loop being adorned with novector.  This causes a segfault because cond
isn't expected to be null.

This fixes it by issuing the same kind of diagnostics we issue for the other
pragmas.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

OK for master, and for a backport to GCC 14?

Thanks,
Tamar

gcc/cp/ChangeLog:

PR c++/115623
* parser.cc (cp_parser_c_for): Add check for C++ cond.

gcc/testsuite/ChangeLog:

PR c++/115623
* g++.dg/vect/vect-novector-pragma_2.cc: New test.

---
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index e7409b856f1127e303c6515a3bb2d61a10e7c378..24d7b0e4992fdff69951ac5955f304e473f53374 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -14107,6 +14107,12 @@ cp_parser_c_for (cp_parser *parser, tree scope, tree init, bool ivdep,
                        "% pragma");
       condition = error_mark_node;
     }
+  else if (novector)
+    {
+      cp_parser_error (parser, "missing loop condition in loop with "
+                       "%<GCC novector%> pragma");
+      condition = error_mark_node;
+    }
   finish_for_cond (condition, stmt, ivdep, unroll, novector);
   /* Look for the `;'.  */
   cp_parser_require (parser, CPP_SEMICOLON, RT_SEMICOLON);
diff --git a/gcc/testsuite/g++.dg/vect/vect-novector-pragma_2.cc b/gcc/testsuite/g++.dg/vect/vect-novector-pragma_2.cc
new file mode 100644
index ..05dba4db1c6544bc53cd05482d1b2e767052cf43
--- /dev/null
+++ b/gcc/testsuite/g++.dg/vect/vect-novector-pragma_2.cc
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+
+void f (char *a, int i)
+{
+#pragma GCC novector
+  for (;;i++)
+    a[i] *= 2;
+}
+
+/* { dg-error "missing loop condition in loop with 'GCC novector' pragma before ';' token" "" { target *-*-* } 6 } */









[PATCH 1/3] Release structures on function return

2024-06-25 Thread Jørgen Kvalsvik
The value vec objects are destroyed on exit, but release still needs to
be called explicitly.

gcc/ChangeLog:

* tree-profile.cc (find_conditions): Release vectors before
  return.
---
 gcc/tree-profile.cc | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/tree-profile.cc b/gcc/tree-profile.cc
index e4bb689cef5..18f48e8d04e 100644
--- a/gcc/tree-profile.cc
+++ b/gcc/tree-profile.cc
@@ -919,6 +919,9 @@ find_conditions (struct function *fn)
 if (!have_post_dom)
free_dominance_info (fn, CDI_POST_DOMINATORS);
 
+for (auto expr : exprs)
+  expr.second.release ();
+
 cov->m_masks.safe_grow_cleared (2 * cov->m_index.last ());
 const size_t length = cov_length (cov);
 for (size_t i = 0; i != length; i++)
-- 
2.39.2



[PATCH 2/3] Add section on MC/DC in gcov manual

2024-06-25 Thread Jørgen Kvalsvik
gcc/ChangeLog:

* doc/gcov.texi: Add MC/DC section.
---
 gcc/doc/gcov.texi | 72 +++
 1 file changed, 72 insertions(+)

diff --git a/gcc/doc/gcov.texi b/gcc/doc/gcov.texi
index dc79bccb8cf..a9221738cce 100644
--- a/gcc/doc/gcov.texi
+++ b/gcc/doc/gcov.texi
@@ -917,6 +917,78 @@ of times the call was executed will be printed.  This will 
usually be
 100%, but may be less for functions that call @code{exit} or @code{longjmp},
 and thus may not return every time they are called.
 
+When you use the @option{-g} option, your output looks like this:
+
+@smallexample
+$ gcov -t -m -g tmp
+-:0:Source:tmp.cpp
+-:0:Graph:tmp.gcno
+-:0:Data:tmp.gcda
+-:0:Runs:1
+-:1:#include <stdio.h>
+-:2:
+-:3:int
+1:4:main (void)
+-:5:@{
+-:6:  int i, total;
+1:7:  total = 0;
+-:8:
+   11:9:  for (i = 0; i < 10; i++)
+condition outcomes covered 2/2
+   10:   10:total += i;
+-:   11:
+   1*:   12:  int v = total > 100 ? 1 : 2;
+condition outcomes covered 1/2
+condition  0 not covered (true)
+-:   13:
+   1*:   14:  if (total != 45 && v == 1)
+condition outcomes covered 1/4
+condition  0 not covered (true)
+condition  1 not covered (true false)
+#:   15:printf ("Failure\n");
+-:   16:  else
+1:   17:printf ("Success\n");
+1:   18:  return 0;
+-:   19:@}
+@end smallexample
+
+For every condition the number of taken and total outcomes are
+printed, and if there are uncovered outcomes a line will be printed
+for each condition showing the uncovered outcome in parentheses.
+Conditions are identified by their index -- index 0 is the left-most
+condition.  In @code{a || (b && c)}, @var{a} is condition 0, @var{b}
+condition 1, and @var{c} condition 2.
+
+An outcome is considered covered if it has an independent effect on
+the decision, also known as masking MC/DC (Modified Condition/Decision
+Coverage).  In this example the decision evaluates to true and @var{a}
+is evaluated, but not covered.  This is because @var{a} cannot affect
+the decision independently -- both @var{a} and @var{b} must change
+value for the decision to change.
+
+@smallexample
+$ gcov -t -m -g tmp
+-:0:Source:tmp.c
+-:0:Graph:tmp.gcno
+-:0:Data:tmp.gcda
+-:0:Runs:1
+-:1:#include <stdio.h>
+-:2:
+1:3:int main()
+-:4:@{
+1:5:  int a = 1;
+1:6:  int b = 0;
+-:7:
+1:8:  if (a && b)
+condition outcomes covered 1/4
+condition  0 not covered (true false)
+condition  1 not covered (true)
+#:9:printf ("Success!\n");
+-:   10:  else
+1:   11:printf ("Failure!\n");
+-:   12:@}
+@end smallexample
+
 The execution counts are cumulative.  If the example program were
 executed again without removing the @file{.gcda} file, the count for the
 number of times each line in the source was executed would be added to
-- 
2.39.2



[PATCH 3/3] Use the term MC/DC in help for gcov --conditions

2024-06-25 Thread Jørgen Kvalsvik
Without key terms like "masking" and "MC/DC" it is not at all obvious
what --conditions actually reports on, and there is no easy way for the
user to figure it out. By at least including the two key terms, MC/DC
and masking, users have something to search for.

gcc/ChangeLog:

* gcov.cc (print_usage): Reference masking MC/DC.
---
 gcc/gcov.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/gcov.cc b/gcc/gcov.cc
index f6787f0be8f..1e2e193d79d 100644
--- a/gcc/gcov.cc
+++ b/gcc/gcov.cc
@@ -1015,7 +1015,7 @@ print_usage (int error_p)
   fnotice (file, "  -c, --branch-counts Output counts of branches 
taken\n\
 rather than percentages\n");
   fnotice (file, "  -g, --conditionsInclude modified 
condition/decision\n\
-coverage in output\n");
+coverage (masking MC/DC) in output\n");
   fnotice (file, "  -d, --display-progress  Display progress 
information\n");
   fnotice (file, "  -D, --debugDisplay debugging 
dumps\n");
   fnotice (file, "  -f, --function-summariesOutput summaries for each 
function\n");
-- 
2.39.2



[PATCH 1/2] Record edge true/false value for gcov

2024-06-25 Thread Jørgen Kvalsvik
Make gcov aware of which edges are the true/false edges so that it can
reconstruct the CFG more accurately.  There are plenty of bits left in
arc_info, and this opens the door to richer reporting.

gcc/ChangeLog:

* gcov-io.h (GCOV_ARC_TRUE): New.
(GCOV_ARC_FALSE): New.
* gcov.cc (struct arc_info): Add true_value, false_value.
(read_graph_file): Read true_value, false_value.
* profile.cc (branch_prob): Write GCOV_ARC_TRUE, GCOV_ARC_FALSE.
---
 gcc/gcov-io.h  | 2 ++
 gcc/gcov.cc| 8 
 gcc/profile.cc | 4 
 3 files changed, 14 insertions(+)

diff --git a/gcc/gcov-io.h b/gcc/gcov-io.h
index 20f805598f0..5dc467c92b1 100644
--- a/gcc/gcov-io.h
+++ b/gcc/gcov-io.h
@@ -337,6 +337,8 @@ GCOV_COUNTERS
 #define GCOV_ARC_ON_TREE   (1 << 0)
 #define GCOV_ARC_FAKE  (1 << 1)
 #define GCOV_ARC_FALLTHROUGH   (1 << 2)
+#define GCOV_ARC_TRUE  (1 << 3)
+#define GCOV_ARC_FALSE (1 << 4)
 
 /* Object & program summary record.  */
 
diff --git a/gcc/gcov.cc b/gcc/gcov.cc
index 1e2e193d79d..b9e41fd5172 100644
--- a/gcc/gcov.cc
+++ b/gcc/gcov.cc
@@ -117,6 +117,12 @@ struct arc_info
   /* Loop making arc.  */
   unsigned int cycle : 1;
 
+  /* Is a true arc.  */
+  unsigned int true_value : 1;
+
+  /* Is a false arc.  */
+  unsigned int false_value : 1;
+
   /* Links to next arc on src and dst lists.  */
   struct arc_info *succ_next;
   struct arc_info *pred_next;
@@ -2095,6 +2101,8 @@ read_graph_file (void)
  arc->on_tree = !!(flags & GCOV_ARC_ON_TREE);
  arc->fake = !!(flags & GCOV_ARC_FAKE);
  arc->fall_through = !!(flags & GCOV_ARC_FALLTHROUGH);
+ arc->true_value = !!(flags & GCOV_ARC_TRUE);
+ arc->false_value = !!(flags & GCOV_ARC_FALSE);
 
  arc->succ_next = src_blk->succ;
  src_blk->succ = arc;
diff --git a/gcc/profile.cc b/gcc/profile.cc
index 2b90e6cc510..25d4f4a4b86 100644
--- a/gcc/profile.cc
+++ b/gcc/profile.cc
@@ -1456,6 +1456,10 @@ branch_prob (bool thunk)
flag_bits |= GCOV_ARC_FAKE;
  if (e->flags & EDGE_FALLTHRU)
flag_bits |= GCOV_ARC_FALLTHROUGH;
+ if (e->flags & EDGE_TRUE_VALUE)
+   flag_bits |= GCOV_ARC_TRUE;
+ if (e->flags & EDGE_FALSE_VALUE)
+   flag_bits |= GCOV_ARC_FALSE;
  /* On trees we don't have fallthru flags, but we can
 recompute them from CFG shape.  */
  if (e->flags & (EDGE_TRUE_VALUE | EDGE_FALSE_VALUE)
-- 
2.39.2



[PATCH 2/2] [RFC] Prime path coverage in gcc/gcov

2024-06-25 Thread Jørgen Kvalsvik
These are the main highlights since v1:

1. The instrumentation phase has been reworked and tries to eliminate
   redundant instructions. This has a massive impact on performance,
   taking the compile time of tree.c from 13m30s-14m to ~2m30s on my
   machine, and the resulting tree.o from ~50M to 32M in size.

2. Some bugfixes and more tests.

3. Some temporary hacks have been removed and replaced with more
   permanent solutions. Notable examples are the assumptions on
   bits-per-unit, and there is no more use of the C++ STL.

I still need to implement knobs and options, and work a bit more on the
reporting.

Compile times are still brutal, but much better than before. It doesn't
seem like I emit more instructions than I absolutely have to (but I
would like a counterexample here). I don't know if there is much that
can be done about this other than general speed improvements in
verify-ssa, verify-gimple, verify-control-flow and the like.

--

This patch adds prime path coverage to gcc/gcov. It is a bit rough in a few
places, but I think all the main components are there and ready for some
feedback while I keep working on the details. First, a quick
introduction to path coverage, before I explain a bit on the pieces of
the patch and on what's missing.

PRIME PATHS

Path coverage is recording the paths taken through the program. Here is a
simple example:

if (cond1)      BB 1
  then1 ()      BB 2
else
  else1 ()      BB 3

if (cond2)      BB 4
  then2 ()      BB 5
else
  else2 ()      BB 6

_               BB 7

To cover all paths you must run {then1 then2}, {then1 else2}, {else1 then2},
{else1 else2}. This is in contrast with line/statement coverage where it is
sufficient to execute then2, and it does not matter if it was reached through
then1 or else1.

1 2 4 5 7
1 2 4 6 7
1 3 4 5 7
1 3 4 6 7
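
For an acyclic CFG like this one, the path list can be reproduced with a
plain depth-first walk. The following is only a sketch for illustration;
collect_paths and example_paths are hypothetical names and are not taken
from the patch. Index 0 of the successor table is unused so that indices
match the BB numbers above.

```cpp
#include <vector>

using Path = std::vector<int>;

// Depth-first enumeration of every path from `node` to `exit` in an
// acyclic graph.  succ[v] lists the successors of block v.
void collect_paths (const std::vector<std::vector<int>> &succ, int node,
                    int exit, Path &current, std::vector<Path> &out)
{
  current.push_back (node);
  if (node == exit)
    out.push_back (current);
  else
    for (int next : succ[node])
      collect_paths (succ, next, exit, current, out);
  current.pop_back ();
}

// The CFG of the two-if example: BB 1 branches to 2/3, both join in
// BB 4, which branches to 5/6, both of which join in BB 7.
std::vector<Path> example_paths ()
{
  const std::vector<std::vector<int>> succ = {
    {}, {2, 3}, {4}, {4}, {5, 6}, {7}, {7}, {}
  };
  Path current;
  std::vector<Path> out;
  collect_paths (succ, 1, 7, current, out);
  return out;
}
```

example_paths () yields the four paths listed above, in the same order.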

This gets more complicated with loops, because 0, 1, 2, ..., N iterations are
all different paths. There are different ways of addressing this, a promising
one being prime paths. A prime path is a simple path (a path with no repeated
vertices except for the first/last in a cycle) that does not appear as a subpath
of any other simple path. Prime paths seem to strike a decent balance between
number of tests, path growth, and loop coverage. Of course, the number of paths
still grows very fast with program complexity - for example, this program has
14 prime paths:

  while (a)
    {
      if (b)
        return;
      while (c--)
        a++;
    }
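
To make the definition concrete, here is a naive enumerator that follows
it literally: generate every simple path, then keep those that are not a
proper subpath of another. It is a sketch written for this explanation,
not reference_prime_paths from the patch, and the names (extend,
is_proper_subpath, naive_prime_paths) are hypothetical. It is only
suitable for tiny graphs.

```cpp
#include <algorithm>
#include <vector>

using Path = std::vector<int>;
using Graph = std::vector<std::vector<int>>; // successor lists

// Record `path` and every simple extension of it.
static void extend (const Graph &g, Path &path, std::vector<Path> &out)
{
  out.push_back (path);
  // A closed cycle cannot grow without repeating an interior vertex.
  if (path.size () > 1 && path.front () == path.back ())
    return;
  for (int next : g[path.back ()])
    {
      // `next` may repeat only the first vertex (closing a cycle).
      if (std::find (path.begin () + 1, path.end (), next) != path.end ())
        continue;
      path.push_back (next);
      extend (g, path, out);
      path.pop_back ();
    }
}

// True if `small` occurs contiguously inside the longer path `big`.
static bool is_proper_subpath (const Path &small, const Path &big)
{
  return small.size () < big.size ()
         && std::search (big.begin (), big.end (),
                         small.begin (), small.end ()) != big.end ();
}

std::vector<Path> naive_prime_paths (const Graph &g)
{
  std::vector<Path> simple;
  for (int v = 0; v < (int) g.size (); v++)
    {
      Path path{v};
      extend (g, path, simple);
    }

  std::vector<Path> prime;
  for (const Path &p : simple)
    {
      bool redundant = false;
      for (const Path &q : simple)
        if (is_proper_subpath (p, q))
          {
            redundant = true;
            break;
          }
      if (!redundant)
        prime.push_back (p);
    }
  return prime;
}
```

For a single loop with edges 0->1, 1->2, 2->1 and 1->3 this gives five
prime paths: 0 1 2, 0 1 3, 1 2 1, 2 1 2 and 2 1 3. The quadratic
filtering step is exactly what makes this approach impractical for
larger functions.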

--

ALGORITHM

Since the number of paths grows so fast, we need a good algorithm. The naive
approach of generating all paths and discarding redundancies (see
reference_prime_paths in the diff) simply doesn't complete for even pretty
simple functions with a few tens of thousands of paths (granted, the
implementation is also poor, but it only serves as a reference). Fazli &
Afsharchi in their paper "Time and Space-Efficient Compositional Method for
Prime and Test Paths Generation" describe a neat algorithm which drastically
improves on this and brings complexity down to something manageable. This patch
implements that algorithm with a few minor tweaks.

The algorithm first finds the strongly connected components (SCCs) of the graph
and creates a new graph where the vertices are the SCCs of the CFG. Within
these vertices different paths are found: regular prime paths, paths that
start in the SCC's entries, and paths that end in the SCC's exits. These per-SCC
paths are combined with paths through the CFG, which greatly reduces the number
of paths that need to be evaluated just to be thrown away.
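
The first step, finding the SCCs and building the condensed graph, could
look roughly like this. The sketch uses Tarjan's algorithm as one common
choice; the text above does not say which SCC algorithm the patch uses,
and SccFinder, scc_ids and condensation_edges are hypothetical names.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

using Graph = std::vector<std::vector<int>>; // successor lists

// Tarjan's strongly connected components, recursive formulation.
struct SccFinder
{
  const Graph &g;
  std::vector<int> index, low, comp, stack;
  std::vector<bool> on_stack;
  int next_index = 0, next_comp = 0;

  explicit SccFinder (const Graph &graph)
    : g (graph), index (graph.size (), -1), low (graph.size (), 0),
      comp (graph.size (), -1), on_stack (graph.size (), false) {}

  void visit (int v)
  {
    index[v] = low[v] = next_index++;
    stack.push_back (v);
    on_stack[v] = true;
    for (int w : g[v])
      {
        if (index[w] < 0)
          {
            visit (w);
            low[v] = std::min (low[v], low[w]);
          }
        else if (on_stack[w])
          low[v] = std::min (low[v], index[w]);
      }
    if (low[v] == index[v])
      {
        // v is the root of a component: pop its members off the stack.
        int w;
        do
          {
            w = stack.back ();
            stack.pop_back ();
            on_stack[w] = false;
            comp[w] = next_comp;
          }
        while (w != v);
        next_comp++;
      }
  }
};

// comp[v] is the id of the SCC that contains vertex v.
std::vector<int> scc_ids (const Graph &g)
{
  SccFinder f (g);
  for (int v = 0; v < (int) g.size (); v++)
    if (f.index[v] < 0)
      f.visit (v);
  return f.comp;
}

// Edges of the condensed graph: one per pair of distinct, connected SCCs.
std::vector<std::pair<int, int>>
condensation_edges (const Graph &g, const std::vector<int> &comp)
{
  std::vector<std::pair<int, int>> edges;
  for (int v = 0; v < (int) g.size (); v++)
    for (int w : g[v])
      if (comp[v] != comp[w])
        edges.emplace_back (comp[v], comp[w]);
  std::sort (edges.begin (), edges.end ());
  edges.erase (std::unique (edges.begin (), edges.end ()), edges.end ());
  return edges;
}
```

For a loop with edges 0->1, 1->2, 2->1 and 1->3, vertices 1 and 2 land in
the same SCC, and the condensed graph has three vertices and two edges.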

Using this algorithm we can generate the prime paths for somewhat complicated
functions in a reasonable time. This is the prime_paths function. Please note
that some paths don't benefit from this at all. We need to find the prime paths
within an SCC, so if a single SCC is very large the function degenerates to the
naive implementation. Improving on this is a later project.

--

OVERALL ARCHITECTURE

Like the other coverages in gcc, this operates on the CFG in the profiling
phase, just after branch and condition coverage, in phases:

1. All prime paths are generated, counted, and enumerated from the CFG
2. The paths are evaluated and counter instructions and accumulators are
   emitted
3. gcov reads the CFG and computes the prime paths (same as step 1)
4. gcov prints a report

Simply writing out all the paths in the .gcno file is not really viable;
the files would be too big. Additionally, there are limits to the
practicality of measuring (and reporting) on millions of paths, so for
most programs where coverage is feasible, computing paths should be
plenty fast. As a result, path coverage really only adds 1 bit per path
to the counters, rounded up to the nearest 64 ("bucket"), so 64 paths
take up 8 bytes and 65 paths take up 16 bytes.
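
The size arithmetic is a round-up division. A small sketch, assuming
64-bit buckets as in the numbers above (bucket_count and counter_bytes
are hypothetical helper names, not from the patch):

```cpp
#include <cstddef>
#include <cstdint>

// Number of buckets needed to hold one bit per path.
constexpr std::size_t bucket_count (std::size_t paths,
                                    std::size_t bits_per_bucket = 64)
{
  return (paths + bits_per_bucket - 1) / bits_per_bucket;
}

// Bytes of counter storage for `paths` prime paths with 64-bit buckets.
constexpr std::size_t counter_bytes (std::size_t paths)
{
  return bucket_count (paths) * sizeof (std::uint64_t);
}

static_assert (counter_bytes (64) == 8, "64 paths fit in one bucket");
static_assert (counter_bytes (65) == 16, "65 paths need a second bucket");
```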

Recording paths is really just massaging large bitsets. Per function,
ceil(paths/64 or 32) buckets (gcov_type) are allocated. Paths are
sorted, so the first path maps to the lowest bit, the second path to the
second lowest bit, and so o

Re: [PATCH 05/52] rust: Replace uses of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE

2024-06-25 Thread Arthur Cohen

Hi Kewen,

Sorry for not answering earlier - yes, this sounds good to me :) thanks 
for taking care of that.


Best,

Arthur

On 6/21/24 12:36, Kewen.Lin wrote:

Hi Arthur,

on 2024/6/21 18:17, Arthur Cohen wrote:

Hi,

Sorry about the delay in my answer! The patch looks good to me :) Will you push 
it as part of your patchset?



Thanks for the review!  Since this one doesn't necessarily depend on
"09/52 Replace {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE with new hook
mode_for_floating_type", I'm going to push this before that (just like
the other FE changes excepting for the jit one 10/52 which depends on
the new hook 09/52).  btw, all after 09/52 would be merged into 09/52
when committing. :)

Does it sound good to you?

BR,
Kewen


Kindly,

Arthur

On 6/3/24 05:00, Kewen Lin wrote:

Joseph pointed out "floating types should have their mode,
not a poorly defined precision value" in the discussion[1],
as he and Richi suggested, the existing macros
{FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE will be replaced with a
hook mode_for_floating_type.  To be prepared for that, this
patch is to replace use of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE
in rust with TYPE_PRECISION of {float,{,long_}double}_type_node.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651209.html

gcc/rust/ChangeLog:

 * rust-gcc.cc (float_type): Use TYPE_PRECISION of
 {float,double,long_double}_type_node to replace
 {FLOAT,DOUBLE,LONG_DOUBLE}_TYPE_SIZE.
---
   gcc/rust/rust-gcc.cc | 6 +++---
   1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/rust/rust-gcc.cc b/gcc/rust/rust-gcc.cc
index f17e19a2dfc..38169c08985 100644
--- a/gcc/rust/rust-gcc.cc
+++ b/gcc/rust/rust-gcc.cc
@@ -411,11 +411,11 @@ tree
   float_type (int bits)
   {
     tree type;
-  if (bits == FLOAT_TYPE_SIZE)
+  if (bits == TYPE_PRECISION (float_type_node))
   type = float_type_node;
-  else if (bits == DOUBLE_TYPE_SIZE)
+  else if (bits == TYPE_PRECISION (double_type_node))
   type = double_type_node;
-  else if (bits == LONG_DOUBLE_TYPE_SIZE)
+  else if (bits == TYPE_PRECISION (long_double_type_node))
   type = long_double_type_node;
     else
   {


[PATCH ver3] rs6000, altivec-1-runnable.c update the, require-effective-target

2024-06-25 Thread Carl Love
GCC maintainers:

version 3, rebased on current mainline tree.  Version 2 of the patch was out of 
sync. Retested the patch on 
Power 10 with no regressions.

version 2, update the dg options per the feedback.  Retested the patch on Power 
10 with no regressions.

This patch updates the dg options.

The patch has been tested on Power 10 with no regression failures.

Please let me know if this patch is acceptable for mainline.  Thanks.

Carl 



rs6000, altivec-1-runnable.c update the require-effective-target

Update the dg test directives.

gcc/testsuite/ChangeLog:
* gcc.target/powerpc/altivec-1-runnable.c: Change the
require-effective-target for the test.
---
 gcc/testsuite/gcc.target/powerpc/altivec-1-runnable.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/altivec-1-runnable.c 
b/gcc/testsuite/gcc.target/powerpc/altivec-1-runnable.c
index 4e32860a169..6763ff3ff8b 100644
--- a/gcc/testsuite/gcc.target/powerpc/altivec-1-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/altivec-1-runnable.c
@@ -1,7 +1,9 @@
-/* { dg-do compile { target powerpc*-*-* } } */
-/* { dg-options "-maltivec" } */
+/* { dg-do run { target vmx_hw } } */
+/* { dg-do compile { target { ! vmx_hw } } } */
+/* { dg-options "-O2 -maltivec" } */
 /* { dg-require-effective-target powerpc_altivec } */
 
+
 #include 
 
 #ifdef DEBUG
-- 
2.45.0



Re: [PATCH 04/52] go: Replace uses of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE

2024-06-25 Thread Kewen.Lin
on 2024/6/12 20:32, Ian Lance Taylor wrote:
> "Kewen.Lin"  writes:
> 
>> Hi,
>>
>> Gentle ping:
>>
>> https://gcc.gnu.org/pipermail/gcc-patches/2024-June/653387.html
>>
>> BR,
>> Kewen
>>
>> on 2024/6/3 11:00, Kewen Lin wrote:
>>> Joseph pointed out "floating types should have their mode,
>>> not a poorly defined precision value" in the discussion[1],
>>> as he and Richi suggested, the existing macros
>>> {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE will be replaced with a
>>> hook mode_for_floating_type.  To be prepared for that, this
>>> patch is to replace use of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE
>>> in go with TYPE_PRECISION of {float,{,long_}double}_type_node.
>>>
>>> [1] https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651209.html
>>>
>>> gcc/go/ChangeLog:
>>>
>>> * go-gcc.cc (Gcc_backend::float_type): Use TYPE_PRECISION of
>>> {float,double,long_double}_type_node to replace
>>> {FLOAT,DOUBLE,LONG_DOUBLE}_TYPE_SIZE.
>>> (Gcc_backend::complex_type): Likewise.
> 
> This is fine if the other parts of the patch are accepted.

Thanks, pushed as r15-1591-gfafd87830937d5.

BR,
Kewen



Re: [PATCH version 2] rs6000, altivec-1-runnable.c update the, require-effective-target

2024-06-25 Thread Carl Love
Kewen:

On 6/23/24 19:41, Kewen.Lin wrote:
> Hi,
> 
> on 2024/6/22 00:15, Carl Love wrote:
>> GCC maintainers:
>>
>> version 2, update the dg options per the feedback.  Retested the patch on 
>> Power 10 with no regressions.
>>
>> This patch updates the dg options.
>>
>> The patch has been tested on Power 10 with no regression failures.
>>
>> Please let me know if this patch is acceptable for mainline.  Thanks.
>>
>> Carl 
>>
>> -- 
>> rs6000, altivec-1-runnable.c update the require-effective-target
>>
>> Update the dg test directives.
>>
>> gcc/testsuite/ChangeLog:
>>  * gcc.target/powerpc/altivec-1-runnable.c: Change the
>>  require-effective-target for the test.
>> ---
>>  gcc/testsuite/gcc.target/powerpc/altivec-1-runnable.c | 7 ---
>>  1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/gcc/testsuite/gcc.target/powerpc/altivec-1-runnable.c 
>> b/gcc/testsuite/gcc.target/powerpc/altivec-1-runnable.c
>> index da8ebbc30ba..3f084c91798 100644
>> --- a/gcc/testsuite/gcc.target/powerpc/altivec-1-runnable.c
>> +++ b/gcc/testsuite/gcc.target/powerpc/altivec-1-runnable.c
>> @@ -1,6 +1,7 @@
>> -/* { dg-do compile { target powerpc*-*-* } } */
>> -/* { dg-require-effective-target powerpc_altivec_ok } */
>> -/* { dg-options "-maltivec" } */
>> +/* { dg-do run { target vmx_hw } } */
>> +/* { dg-do compile { target { ! vmx_hw } } } */
>> +/* { dg-options "-O2 -maltivec" } */
>> +/* { dg-require-effective-target powerpc_altivec } */
> 
> This one needs rebasing, "powerpc_altivec" has been adjusted on trunk.

Yes, this seems to be out of sync.  I will rebase on the current upstream tree 
and re-post.

 Carl  


Re: [PATCH 09/52 v2] Replace {FLOAT, {, LONG_}DOUBLE}_TYPE_SIZE with new hook mode_for_floating_type

2024-06-25 Thread Kewen.Lin
Hi all,

I just pushed this 09/52 v2 with its following target changes
as r15-1594, thanks a lot for your comments/reviews/approvals!

BR,
Kewen

> Subject: [PATCH 09/52] Replace {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE with new hook
>  mode_for_floating_type
> 
> Currently how we determine which mode will be used for a
> floating point type is that for a given type precision
> (size) call mode_for_size to get the first mode which has
> this size in the specified class.  On Powerpc, we have
> three modes (TF/KF/IF) having the same mode precision 128
> (see[1]), so the processing forces us to have to place TF
> at the first place, it would require us to make more
> adjustment in some generic code to avoid some unexpected
> mode conversions and it would be even worse if we get rid
> of TF eventually one day.  And as Joseph pointed out in [2],
> "floating types should have their mode, not a poorly
> defined precision value", as Joseph and Richi suggested,
> this patch is to introduce one hook mode_for_floating_type
> which returns the corresponding mode for type float, double
> or long double.  The default implementation returns SFmode
> for float and DFmode for double or long double.  For ports
> which need special treatment, there are some other patches
> for their own port specific implementation (referring to
> how {,LONG_}DOUBLE_TYPE_SIZE get used there).  For all
> generic uses of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE, depending
> on the context, some of them are replaced with TYPE_PRECISION
> of the according type node, some other are replaced with
> GET_MODE_PRECISION on the mode from mode_for_floating_type.
> This patch also poisons {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE,
> so most defines of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE in port
> specific are removed, but there are still some which are
> good to be kept for readability then they get renamed with
> port specific prefix.
> 
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651017.html
> [2] https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651209.html
> 
> gcc/ChangeLog:
> 
>   * coretypes.h (enum tree_index): Forward declaration.
>   * defaults.h (FLOAT_TYPE_SIZE): Remove.
>   (DOUBLE_TYPE_SIZE): Likewise.
>   (LONG_DOUBLE_TYPE_SIZE): Likewise.
>   * doc/rtl.texi: Update document by replacing {FLOAT,DOUBLE}_TYPE_SIZE
>   with C type {float,double}.
>   * doc/tm.texi.in: Document new hook mode_for_floating_type, remove
>   document entries for {FLOAT,DOUBLE,LONG_DOUBLE}_TYPE_SIZE and
>   update document for WIDEST_HARDWARE_FP_SIZE.
>   * doc/tm.texi: Regenerate.
>   * emit-rtl.cc (init_emit_once): Replace DOUBLE_TYPE_SIZE by
>   calling targetm.c.mode_for_floating_type with TI_DOUBLE_TYPE.
>   * real.h (REAL_VALUE_TO_TARGET_LONG_DOUBLE): Use TYPE_PRECISION of
>   long_double_type_node to replace LONG_DOUBLE_TYPE_SIZE.
>   * system.h (FLOAT_TYPE_SIZE): Poison.
>   (DOUBLE_TYPE_SIZE): Likewise.
>   (LONG_DOUBLE_TYPE_SIZE): Likewise.
>   * target.def (mode_for_floating_type): New hook.
>   * targhooks.cc (default_mode_for_floating_type): New function.
>   (default_scalar_mode_supported_p): Update macros
>   {FLOAT,DOUBLE,LONG_DOUBLE}_TYPE_SIZE by calling
>   targetm.c.mode_for_floating_type with
>   TI_{FLOAT,DOUBLE,LONG_DOUBLE}_TYPE.
>   * targhooks.h (default_mode_for_floating_type): New declaration.
>   * tree-core.h (enum tree_index): Specify underlying type unsigned
>   to sync with forward declaration in coretypes.h.
>   (NUM_FLOATN_TYPES): Explicitly convert to int.
>   (NUM_FLOATNX_TYPES): Likewise.
>   (NUM_FLOATN_NX_TYPES): Likewise.
>   * tree.cc (build_common_tree_nodes): Update macros
>   {FLOAT,DOUBLE,LONG_DOUBLE}_TYPE_SIZE by calling
>   targetm.c.mode_for_floating_type with
>   TI_{FLOAT,DOUBLE,LONG_DOUBLE}_TYPE and set type mode accordingly.
> ---
>  gcc/coretypes.h|  1 +
>  gcc/defaults.h | 12 
>  gcc/doc/rtl.texi   |  2 +-
>  gcc/doc/tm.texi| 33 +
>  gcc/doc/tm.texi.in | 27 +++
>  gcc/emit-rtl.cc|  3 ++-
>  gcc/real.h |  7 ---
>  gcc/system.h   |  3 ++-
>  gcc/target.def |  9 +
>  gcc/targhooks.cc   | 18 +++---
>  gcc/targhooks.h|  1 +
>  gcc/tree-core.h| 13 +++--
>  gcc/tree.cc| 18 +++---
>  13 files changed, 77 insertions(+), 70 deletions(-)
> 
> diff --git a/gcc/coretypes.h b/gcc/coretypes.h
> index 1ac6f0abea3..00c1c58bd8c 100644
> --- a/gcc/coretypes.h
> +++ b/gcc/coretypes.h
> @@ -100,6 +100,7 @@ struct gimple;
>  typedef gimple *gimple_seq;
>  struct gimple_stmt_iterator;
>  class code_helper;
> +enum tree_index : unsigned;
> 
>  /* Forward declare rtx_code, so that we can use it in target hooks without
> needing to pull in rtl.h.  */
> diff --git a/gcc/defaults.h b/gcc/defaults.h
> index 92f3e07f742..ac2d25852ab 100644
> --- a/gcc/defaults.h
> 

Re: [PATCH 3/8] Make more use of force_subreg

2024-06-25 Thread Richard Sandiford
Jeff Law  writes:
> On 6/17/24 3:53 AM, Richard Sandiford wrote:
>> This patch makes target-independent code use force_subreg instead
>> of simplify_gen_subreg in some places.  The criteria were:
>> 
>> (1) The code is obviously specific to expand (where new pseudos
>>  can be created), or at least would be invalid to call when
>>  !can_create_pseudo_p () and temporaries are needed.
>> 
>> (2) The value is obviously an rvalue rather than an lvalue.
>> 
>> (3) The offset wasn't a simple lowpart or highpart calculation;
>>  a later patch will deal with those.
>> 
>> Doing this should reduce the likelihood of bugs like PR115464
>> occurring in other situations.
>> 
>> gcc/
>>  * expmed.cc (store_bit_field_using_insv): Use force_subreg
>>  instead of simplify_gen_subreg.
>>  (store_bit_field_1): Likewise.
>>  (extract_bit_field_as_subreg): Likewise.
>>  (extract_integral_bit_field): Likewise.
>>  (emit_store_flag_1): Likewise.
>>  * expr.cc (convert_move): Likewise.
>>  (convert_modes): Likewise.
>>  (emit_group_load_1): Likewise.
>>  (emit_group_store): Likewise.
>>  (expand_assignment): Likewise.
> [ ... ]
>
> So this has triggered a failure on ft32-elf with this testcase 
> (simplified from the testsuite):
>
> typedef _Bool bool;
> const bool false = 0;
> const bool true = 1;
>
> struct RenderBox
> {
>bool m_positioned : 1;
> };
>
> typedef struct RenderBox RenderBox;
>
>
> void RenderBox_setStyle(RenderBox *thisin)
> {
>RenderBox *this = thisin;
>bool ltrue = true;
>this->m_positioned = ltrue;
>
> }
>
>
>
> Before this change we generated this:
>
>> (insn 13 12 14 (set (reg:QI 47)
>> (mem/c:QI (plus:SI (reg/f:SI 37 virtual-stack-vars)
>> (const_int -5 [0xfffb])) [1 ltrue+0 S1 A8])) 
>> "j.c":17:22 -1
>>  (nil))
>> 
>> (insn 14 13 15 (parallel [
>> (set (zero_extract:SI (subreg:SI (reg:QI 46) 0)
>> (const_int 1 [0x1])
>> (const_int 0 [0]))
>> (subreg:SI (reg:QI 47) 0))
>> (clobber (scratch:SI))
>> ]) "j.c":17:22 -1
>>  (nil))
>
>
> Afterwards we generate:
>
>> (insn 13 12 14 2 (parallel [
>> (set (zero_extract:SI (subreg:SI (reg:QI 46) 0)
>> (const_int 1 [0x1])
>> (const_int 0 [0]))
>> (subreg:SI (mem/c:QI (plus:SI (reg/f:SI 37 
>> virtual-stack-vars)
>> (const_int -5 [0xfffb])) [1 ltrue+0 
>> S1 A8]) 0))
>> (clobber (scratch:SI))
>> ]) "j.c":17:22 -1
>>  (nil))
>
> Note the (subreg (mem (...)).  Probably not desirable in general, but 
> also note the virtual-stack-vars in the memory address.  The code to 
> instantiate virtual registers doesn't handle (subreg (mem)), so we never 
> convert that to an FP based address and we eventually fault.
>
> Should be visible with ft32-elf cross compiler.  No options needed.

Bah.  Thanks for the report.

I agree of course with the follow-on discussion that we should get
rid of (subreg (mem)).  But this was supposed to be a conservative
patch.  I've therefore reverted the offending part of the commit,
as below.  (Tested on aarch64-linux-gnu.)

Richard


One of the changes in g:d4047da6a070175aae7121c739d1cad6b08ff4b2
caused a regression in ft32-elf; see:

https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655418.html

for details.  This change was different from the others in that the
original call was to simplify_subreg rather than simplify_lowpart_subreg.
The old code would therefore go on to do the force_reg for more cases
than the new code would.
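
To make the difference concrete, here is a toy C++ model that uses strings
instead of rtx.  The function names mirror the GCC ones, but the bodies are
simplified assumptions that cover only the paradoxical QI->SI case from the
report; they are not the real implementations:

```cpp
#include <cassert>
#include <optional>
#include <string>

// Toy operand: a printable form plus a flag saying whether it is a MEM.
struct Operand
{
  std::string text;
  bool is_mem;
};

// Toy simplify_subreg: only folds to an existing simpler form.  For the
// paradoxical case modelled here there is nothing to fold, so it fails.
static std::optional<Operand>
toy_simplify_subreg (const Operand &)
{
  return std::nullopt;
}

// Toy simplify_gen_subreg: falls back to wrapping the operand in a new
// subreg, which also succeeds for a MEM.
static std::optional<Operand>
toy_simplify_gen_subreg (const Operand &op)
{
  if (auto folded = toy_simplify_subreg (op))
    return folded;
  return Operand { "(subreg:SI " + op.text + " 0)", false };
}

// Toy force_reg: copy a MEM into a fresh pseudo; registers pass through.
static Operand
toy_force_reg (const Operand &op)
{
  if (!op.is_mem)
    return op;
  return Operand { "(reg:QI tmp)", false };
}

// Old sequence (restored by the revert): simplify_subreg first, then
// force_reg + simplify_gen_subreg if that fails.
static std::string
old_sequence (const Operand &value)
{
  if (auto tmp = toy_simplify_subreg (value))
    return tmp->text;
  auto tmp = toy_simplify_gen_subreg (toy_force_reg (value));
  assert (tmp);
  return tmp->text;
}

// force_subreg-style sequence: simplify_gen_subreg first, force_reg only
// if that fails, so a MEM never reaches force_reg in this model.
static std::string
new_sequence (const Operand &value)
{
  if (auto tmp = toy_simplify_gen_subreg (value))
    return tmp->text;
  auto tmp = toy_simplify_gen_subreg (toy_force_reg (value));
  assert (tmp);
  return tmp->text;
}
```

In this model a MEM operand comes out of old_sequence as a subreg of a
register, and out of new_sequence as a subreg of the MEM itself, which is
the shape seen in the ft32-elf report.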

gcc/
* expmed.cc (store_bit_field_using_insv): Revert earlier change
to use force_subreg instead of simplify_gen_subreg.
---
 gcc/expmed.cc | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/gcc/expmed.cc b/gcc/expmed.cc
index 3b9475f5aa0..8bbbc94a98c 100644
--- a/gcc/expmed.cc
+++ b/gcc/expmed.cc
@@ -695,7 +695,13 @@ store_bit_field_using_insv (const extraction_insn *insv, rtx op0,
 if we must narrow it, be sure we do it correctly.  */
 
  if (GET_MODE_SIZE (value_mode) < GET_MODE_SIZE (op_mode))
-   tmp = force_subreg (op_mode, value1, value_mode, 0);
+   {
+ tmp = simplify_subreg (op_mode, value1, value_mode, 0);
+ if (! tmp)
+   tmp = simplify_gen_subreg (op_mode,
+  force_reg (value_mode, value1),
+  value_mode, 0);
+   }
  else
{
  if (targetm.mode_rep_extended (op_mode, value_mode) != UNKNOWN)
-- 
2.25.1



Re: [PATCH 6/6] Add a late-combine pass [PR106594]

2024-06-25 Thread Thomas Schwinge
Hi!

On 2024-06-20T14:34:18+0100, Richard Sandiford  
wrote:
> This patch adds a combine pass that runs late in the pipeline.
> [...]

Nice!

> The patch [...] disables the pass by default on i386, rs6000
> and xtensa.

Like here:

> --- a/gcc/config/i386/i386-options.cc
> +++ b/gcc/config/i386/i386-options.cc
> @@ -1942,6 +1942,10 @@ ix86_override_options_after_change (void)
>   flag_cunroll_grow_size = flag_peel_loops || optimize >= 3;
>  }
>  
> +  /* Late combine tends to undo some of the effects of STV and RPAD,
> + by combining instructions back to their original form.  */
> +  if (!OPTION_SET_P (flag_late_combine_instructions))
> +flag_late_combine_instructions = 0;
>  }

..., I think also here:

> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -4768,6 +4768,14 @@ rs6000_option_override_internal (bool global_init_p)
>   targetm.expand_builtin_va_start = NULL;
>  }
>  
> +  /* One of the late-combine passes runs after register allocation
> + and can match define_insn_and_splits that were previously used
> + only before register allocation.  Some of those define_insn_and_splits
> + use gen_reg_rtx unconditionally.  Disable late-combine by default
> + until the define_insn_and_splits are fixed.  */
> +  if (!OPTION_SET_P (flag_late_combine_instructions))
> +flag_late_combine_instructions = 0;
> +
>rs6000_override_options_after_change ();

..., this needs to be done in 'rs6000_override_options_after_change'
instead of 'rs6000_option_override_internal', to address the PRs under
discussion.  I'm testing such a patch.
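
As a rough illustration of why the placement matters, here is a toy C++
model.  All names are hypothetical and this is not GCC's real option
machinery.  It assumes that generic code recomputes the flag whenever the
optimization options change (e.g. for an optimize attribute), and that only
the after-change hook is re-run at that point:

```cpp
#include <cassert>

// Toy option state; a stand-in for the real global options, not GCC code.
struct ToyOptions
{
  int late_combine = 0;        // models flag_late_combine_instructions
  bool set_by_user = false;    // models OPTION_SET_P (...)
};

// Assumption: generic code enables the pass at -O2 and above unless the
// user chose a value explicitly.
static void
apply_generic_defaults (ToyOptions &opts, int optimize_level)
{
  assert (optimize_level >= 0);
  if (!opts.set_by_user)
    opts.late_combine = optimize_level >= 2;
}

// Models the target hook that runs after every option change.
static void
toy_override_after_change (ToyOptions &opts, bool disable_here)
{
  if (disable_here && !opts.set_by_user)
    opts.late_combine = 0;
}

// Models the one-time target override, which ends by calling the
// after-change hook (as the rs6000 code quoted above does).
static void
toy_option_override_internal (ToyOptions &opts, bool disable_in_after_change)
{
  if (!disable_in_after_change && !opts.set_by_user)
    opts.late_combine = 0;
  toy_override_after_change (opts, disable_in_after_change);
}

// Returns the flag as seen after the options were changed post-startup.
static int
flag_after_option_change (bool disable_in_after_change)
{
  ToyOptions opts;
  apply_generic_defaults (opts, 2);                 // command-line -O2
  toy_option_override_internal (opts, disable_in_after_change);
  apply_generic_defaults (opts, 3);                 // options change later
  toy_override_after_change (opts, disable_in_after_change);
  return opts.late_combine;
}
```

In this model, flag_after_option_change (false) leaves the pass enabled
once the options change, whereas flag_after_option_change (true) keeps it
disabled.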


Grüße
 Thomas


Re: [PATCH 6/6] Add a late-combine pass [PR106594]

2024-06-25 Thread Richard Sandiford
Thomas Schwinge  writes:
> Hi!
>
> On 2024-06-20T14:34:18+0100, Richard Sandiford  
> wrote:
>> This patch adds a combine pass that runs late in the pipeline.
>> [...]
>
> Nice!
>
>> The patch [...] disables the pass by default on i386, rs6000
>> and xtensa.
>
> Like here:
>
>> --- a/gcc/config/i386/i386-options.cc
>> +++ b/gcc/config/i386/i386-options.cc
>> @@ -1942,6 +1942,10 @@ ix86_override_options_after_change (void)
>>  flag_cunroll_grow_size = flag_peel_loops || optimize >= 3;
>>  }
>>  
>> +  /* Late combine tends to undo some of the effects of STV and RPAD,
>> + by combining instructions back to their original form.  */
>> +  if (!OPTION_SET_P (flag_late_combine_instructions))
>> +flag_late_combine_instructions = 0;
>>  }
>
> ..., I think also here:
>
>> --- a/gcc/config/rs6000/rs6000.cc
>> +++ b/gcc/config/rs6000/rs6000.cc
>> @@ -4768,6 +4768,14 @@ rs6000_option_override_internal (bool global_init_p)
>>  targetm.expand_builtin_va_start = NULL;
>>  }
>>  
>> +  /* One of the late-combine passes runs after register allocation
>> + and can match define_insn_and_splits that were previously used
>> + only before register allocation.  Some of those define_insn_and_splits
>> + use gen_reg_rtx unconditionally.  Disable late-combine by default
>> + until the define_insn_and_splits are fixed.  */
>> +  if (!OPTION_SET_P (flag_late_combine_instructions))
>> +flag_late_combine_instructions = 0;
>> +
>>rs6000_override_options_after_change ();
>
> ..., this needs to be done in 'rs6000_override_options_after_change'
> instead of 'rs6000_option_override_internal', to address the PRs under
> discussion.  I'm testing such a patch.

Oops!  Sorry about that, and thanks for tracking it down.

Richard


rs6000: Properly default-disable late-combine passes [PR106594, PR115622, PR115633] (was: [PATCH 6/6] Add a late-combine pass [PR106594])

2024-06-25 Thread Thomas Schwinge
Hi!

On 2024-06-25T10:07:47+0100, Richard Sandiford  
wrote:
> Thomas Schwinge  writes:
>> On 2024-06-20T14:34:18+0100, Richard Sandiford  
>> wrote:
>>> This patch adds a combine pass that runs late in the pipeline.
>>> [...]
>>
>> Nice!
>>
>>> The patch [...] disables the pass by default on i386, rs6000
>>> and xtensa.
>>
>> Like here:
>>
>>> --- a/gcc/config/i386/i386-options.cc
>>> +++ b/gcc/config/i386/i386-options.cc
>>> @@ -1942,6 +1942,10 @@ ix86_override_options_after_change (void)
>>> flag_cunroll_grow_size = flag_peel_loops || optimize >= 3;
>>>  }
>>>  
>>> +  /* Late combine tends to undo some of the effects of STV and RPAD,
>>> + by combining instructions back to their original form.  */
>>> +  if (!OPTION_SET_P (flag_late_combine_instructions))
>>> +flag_late_combine_instructions = 0;
>>>  }
>>
>> ..., I think also here:
>>
>>> --- a/gcc/config/rs6000/rs6000.cc
>>> +++ b/gcc/config/rs6000/rs6000.cc
>>> @@ -4768,6 +4768,14 @@ rs6000_option_override_internal (bool global_init_p)
>>> targetm.expand_builtin_va_start = NULL;
>>>  }
>>>  
>>> +  /* One of the late-combine passes runs after register allocation
>>> + and can match define_insn_and_splits that were previously used
>>> + only before register allocation.  Some of those define_insn_and_splits
>>> + use gen_reg_rtx unconditionally.  Disable late-combine by default
>>> + until the define_insn_and_splits are fixed.  */
>>> +  if (!OPTION_SET_P (flag_late_combine_instructions))
>>> +flag_late_combine_instructions = 0;
>>> +
>>>rs6000_override_options_after_change ();
>>
>> ..., this needs to be done in 'rs6000_override_options_after_change'
>> instead of 'rs6000_option_override_internal', to address the PRs under
>> discussion.  I'm testing such a patch.
>
> Oops!  Sorry about that, and thanks for tracking it down.

No worries.  ;-) OK to push the attached
"rs6000: Properly default-disable late-combine passes [PR106594, PR115622, 
PR115633]"?


Grüße
 Thomas


From ccd12107fb06017f878384d2186ed5f01a1dab79 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Tue, 25 Jun 2024 10:55:41 +0200
Subject: [PATCH] rs6000: Properly default-disable late-combine passes
 [PR106594, PR115622, PR115633]

..., so that it also works for '__attribute__ ((optimize("[...]")))' etc.

	PR target/106594
	PR target/115622
	PR target/115633
	gcc/
	* config/rs6000/rs6000.cc (rs6000_option_override_internal): Move
	default-disable of late-combine passes from here...
	(rs6000_override_options_after_change): ... to here.
---
 gcc/config/rs6000/rs6000.cc | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index f39b8909925..713fac75f26 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -3431,6 +3431,14 @@ rs6000_override_options_after_change (void)
   /* If we are inserting ROP-protect instructions, disable shrink wrap.  */
   if (rs6000_rop_protect)
 flag_shrink_wrap = 0;
+
+  /* One of the late-combine passes runs after register allocation
+ and can match define_insn_and_splits that were previously used
+ only before register allocation.  Some of those define_insn_and_splits
+ use gen_reg_rtx unconditionally.  Disable late-combine by default
+ until the define_insn_and_splits are fixed.  */
+  if (!OPTION_SET_P (flag_late_combine_instructions))
+flag_late_combine_instructions = 0;
 }
 
 #ifdef TARGET_USES_LINUX64_OPT
@@ -4768,14 +4776,6 @@ rs6000_option_override_internal (bool global_init_p)
 	targetm.expand_builtin_va_start = NULL;
 }
 
-  /* One of the late-combine passes runs after register allocation
- and can match define_insn_and_splits that were previously used
- only before register allocation.  Some of those define_insn_and_splits
- use gen_reg_rtx unconditionally.  Disable late-combine by default
- until the define_insn_and_splits are fixed.  */
-  if (!OPTION_SET_P (flag_late_combine_instructions))
-flag_late_combine_instructions = 0;
-
   rs6000_override_options_after_change ();
 
   /* If not explicitly specified via option, decide whether to generate indexed
-- 
2.34.1



Re: rs6000: Properly default-disable late-combine passes [PR106594, PR115622, PR115633]

2024-06-25 Thread Richard Sandiford
Thomas Schwinge  writes:
> Hi!
>
> On 2024-06-25T10:07:47+0100, Richard Sandiford  
> wrote:
>> Thomas Schwinge  writes:
>>> On 2024-06-20T14:34:18+0100, Richard Sandiford  
>>> wrote:
>>>> This patch adds a combine pass that runs late in the pipeline.
>>>> [...]
>>>
>>> Nice!
>>>
>>>> The patch [...] disables the pass by default on i386, rs6000
>>>> and xtensa.
>>>
>>> Like here:
>>>
>>>> --- a/gcc/config/i386/i386-options.cc
>>>> +++ b/gcc/config/i386/i386-options.cc
>>>> @@ -1942,6 +1942,10 @@ ix86_override_options_after_change (void)
>>>> flag_cunroll_grow_size = flag_peel_loops || optimize >= 3;
>>>>  }
>>>>  
>>>> +  /* Late combine tends to undo some of the effects of STV and RPAD,
>>>> + by combining instructions back to their original form.  */
>>>> +  if (!OPTION_SET_P (flag_late_combine_instructions))
>>>> +flag_late_combine_instructions = 0;
>>>>  }
>>>
>>> ..., I think also here:
>>>
>>>> --- a/gcc/config/rs6000/rs6000.cc
>>>> +++ b/gcc/config/rs6000/rs6000.cc
>>>> @@ -4768,6 +4768,14 @@ rs6000_option_override_internal (bool global_init_p)
>>>> targetm.expand_builtin_va_start = NULL;
>>>>  }
>>>>  
>>>> +  /* One of the late-combine passes runs after register allocation
>>>> + and can match define_insn_and_splits that were previously used
>>>> + only before register allocation.  Some of those define_insn_and_splits
>>>> + use gen_reg_rtx unconditionally.  Disable late-combine by default
>>>> + until the define_insn_and_splits are fixed.  */
>>>> +  if (!OPTION_SET_P (flag_late_combine_instructions))
>>>> +flag_late_combine_instructions = 0;
>>>> +
>>>>rs6000_override_options_after_change ();
>>>
>>> ..., this needs to be done in 'rs6000_override_options_after_change'
>>> instead of 'rs6000_option_override_internal', to address the PRs under
>>> discussion.  I'm testing such a patch.
>>
>> Oops!  Sorry about that, and thanks for tracking it down.
>
> No worries.  ;-) OK to push the attached
> "rs6000: Properly default-disable late-combine passes [PR106594, PR115622, 
> PR115633]"?

Yes, thanks.

Richard

> Grüße
>  Thomas
>
>
> From ccd12107fb06017f878384d2186ed5f01a1dab79 Mon Sep 17 00:00:00 2001
> From: Thomas Schwinge 
> Date: Tue, 25 Jun 2024 10:55:41 +0200
> Subject: [PATCH] rs6000: Properly default-disable late-combine passes
>  [PR106594, PR115622, PR115633]
>
> ..., so that it also works for '__attribute__ ((optimize("[...]")))' etc.
>
>   PR target/106594
>   PR target/115622
>   PR target/115633
>   gcc/
>   * config/rs6000/rs6000.cc (rs6000_option_override_internal): Move
>   default-disable of late-combine passes from here...
>   (rs6000_override_options_after_change): ... to here.
> ---
>  gcc/config/rs6000/rs6000.cc | 16 
>  1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index f39b8909925..713fac75f26 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -3431,6 +3431,14 @@ rs6000_override_options_after_change (void)
>/* If we are inserting ROP-protect instructions, disable shrink wrap.  */
>if (rs6000_rop_protect)
>  flag_shrink_wrap = 0;
> +
> +  /* One of the late-combine passes runs after register allocation
> + and can match define_insn_and_splits that were previously used
> + only before register allocation.  Some of those define_insn_and_splits
> + use gen_reg_rtx unconditionally.  Disable late-combine by default
> + until the define_insn_and_splits are fixed.  */
> +  if (!OPTION_SET_P (flag_late_combine_instructions))
> +flag_late_combine_instructions = 0;
>  }
>  
>  #ifdef TARGET_USES_LINUX64_OPT
> @@ -4768,14 +4776,6 @@ rs6000_option_override_internal (bool global_init_p)
>   targetm.expand_builtin_va_start = NULL;
>  }
>  
> -  /* One of the late-combine passes runs after register allocation
> - and can match define_insn_and_splits that were previously used
> - only before register allocation.  Some of those define_insn_and_splits
> - use gen_reg_rtx unconditionally.  Disable late-combine by default
> - until the define_insn_and_splits are fixed.  */
> -  if (!OPTION_SET_P (flag_late_combine_instructions))
> -flag_late_combine_instructions = 0;
> -
>rs6000_override_options_after_change ();
>  
>/* If not explicitly specified via option, decide whether to generate 
> indexed


Re: [PATCH 7/8] vect: Support multiple lane-reducing operations for loop reduction [PR114440]

2024-06-25 Thread Feng Xue OS
>>
>> >> -  if (slp_node)
>> >> +  if (slp_node && SLP_TREE_LANES (slp_node) > 1)
>> >
>> > Hmm, that looks wrong.  It looks like SLP_TREE_NUMBER_OF_VEC_STMTS is off
>> > instead, which is bad.
>> >
>> >> nvectors = SLP_TREE_NUMBER_OF_VEC_STMTS (slp_node);
>> >>else
>> >> nvectors = vect_get_num_copies (loop_vinfo, vectype_in);
>> >> @@ -7478,6 +7472,152 @@ vect_reduction_update_partial_vector_usage 
>> >> (loop_vec_info loop_vinfo,
>> >>  }
>> >>  }
>> >>
>> >> +/* Check if STMT_INFO is a lane-reducing operation that can be 
>> >> vectorized in
>> >> +   the context of LOOP_VINFO, and vector cost will be recorded in 
>> >> COST_VEC.
>> >> +   Now there are three such kinds of operations: dot-prod/widen-sum/sad
>> >> +   (sum-of-absolute-differences).
>> >> +
>> >> +   For a lane-reducing operation, the loop reduction path that it lies 
>> >> in,
>> >> +   may contain normal operation, or other lane-reducing operation of 
>> >> different
>> >> +   input type size, an example as:
>> >> +
>> >> + int sum = 0;
>> >> + for (i)
>> >> +   {
>> >> + ...
>> >> + sum += d0[i] * d1[i];   // dot-prod 
>> >> + sum += w[i];// widen-sum 
>> >> + sum += abs(s0[i] - s1[i]);  // sad 
>> >> + sum += n[i];// normal 
>> >> + ...
>> >> +   }
>> >> +
>> >> +   Vectorization factor is essentially determined by operation whose 
>> >> input
>> >> +   vectype has the most lanes ("vector(16) char" in the example), while 
>> >> we
>> >> +   need to choose input vectype with the least lanes ("vector(4) int" in 
>> >> the
>> >> +   example) for the reduction PHI statement.  */
>> >> +
>> >> +bool
>> >> +vectorizable_lane_reducing (loop_vec_info loop_vinfo, stmt_vec_info 
>> >> stmt_info,
>> >> +   slp_tree slp_node, stmt_vector_for_cost 
>> >> *cost_vec)
>> >> +{
>> >> +  gimple *stmt = stmt_info->stmt;
>> >> +
>> >> +  if (!lane_reducing_stmt_p (stmt))
>> >> +return false;
>> >> +
>> >> +  tree type = TREE_TYPE (gimple_assign_lhs (stmt));
>> >> +
>> >> +  if (!INTEGRAL_TYPE_P (type) && !SCALAR_FLOAT_TYPE_P (type))
>> >> +return false;
>> >> +
>> >> +  /* Do not try to vectorize bit-precision reductions.  */
>> >> +  if (!type_has_mode_precision_p (type))
>> >> +return false;
>> >> +
>> >> +  if (!slp_node)
>> >> +return false;
>> >> +
>> >> +  for (int i = 0; i < (int) gimple_num_ops (stmt) - 1; i++)
>> >> +{
>> >> +  stmt_vec_info def_stmt_info;
>> >> +  slp_tree slp_op;
>> >> +  tree op;
>> >> +  tree vectype;
>> >> +  enum vect_def_type dt;
>> >> +
>> >> +  if (!vect_is_simple_use (loop_vinfo, stmt_info, slp_node, i, &op,
>> >> +  &slp_op, &dt, &vectype, &def_stmt_info))
>> >> +   {
>> >> + if (dump_enabled_p ())
>> >> +   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
>> >> +"use not simple.\n");
>> >> + return false;
>> >> +   }
>> >> +
>> >> +  if (!vectype)
>> >> +   {
>> >> + vectype = get_vectype_for_scalar_type (loop_vinfo, TREE_TYPE 
>> >> (op),
>> >> +slp_op);
>> >> + if (!vectype)
>> >> +   return false;
>> >> +   }
>> >> +
>> >> +  if (!vect_maybe_update_slp_op_vectype (slp_op, vectype))
>> >> +   {
>> >> + if (dump_enabled_p ())
>> >> +   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
>> >> +"incompatible vector types for 
>> >> invariants\n");
>> >> + return false;
>> >> +   }
>> >> +
>> >> +  if (i == STMT_VINFO_REDUC_IDX (stmt_info))
>> >> +   continue;
>> >> +
>> >> +  /* There should be at most one cycle def in the stmt.  */
>> >> +  if (VECTORIZABLE_CYCLE_DEF (dt))
>> >> +   return false;
>> >> +}
>> >> +
>> >> +  stmt_vec_info reduc_info = STMT_VINFO_REDUC_DEF (vect_orig_stmt 
>> >> (stmt_info));
>> >> +
>> >> +  /* TODO: Support lane-reducing operation that does not directly 
>> >> participate
>> >> + in loop reduction. */
>> >> +  if (!reduc_info || STMT_VINFO_REDUC_IDX (stmt_info) < 0)
>> >> +return false;
>> >> +
>> >> +  /* Lane-reducing pattern inside any inner loop of LOOP_VINFO is not
>> >> + recognized.  */
>> >> +  gcc_assert (STMT_VINFO_DEF_TYPE (reduc_info) == vect_reduction_def);
>> >> +  gcc_assert (STMT_VINFO_REDUC_TYPE (reduc_info) == TREE_CODE_REDUCTION);
>> >> +
>> >> +  tree vectype_in = STMT_VINFO_REDUC_VECTYPE_IN (stmt_info);
>> >> +  int ncopies_for_cost;
>> >> +
>> >> +  if (SLP_TREE_LANES (slp_node) > 1)
>> >> +{
>> >> +  /* Now lane-reducing operations in a non-single-lane slp node 
>> >> should only
>> >> +come from the same loop reduction path.  */
>> >> +  gcc_assert (REDUC_GROUP_FIRST_ELEMENT (stmt_info));
>> >> +  ncopies_for_cost = 1;
>> >> +}
>> >> +  else
>> >> +   

[PING][PATCH] rs6000: ROP - Emit hashst and hashchk insns on Power8 and later [PR114759]

2024-06-25 Thread Peter Bergner
Ping.  [Message-ID: <1a420e3e-3285-4e0b-87bd-6714fedc0...@linux.ibm.com>]

Peter


On 6/19/24 4:14 PM, Peter Bergner wrote:
> We currently only emit the ROP-protect hash* insns for Power10, where the
> insns were added to the architecture.  We want to emit them for earlier
> cpus (where they operate as NOPs), so that if those older binaries are
> ever executed on a Power10, then they'll be protected from ROP attacks.
> Binutils accepts hashst and hashchk back to Power8, so change GCC to emit
> them for Power8 and later.  This matches clang's behavior.
> 
> This patch is independent of the ROP shrink-wrap fix submitted earlier.
> This passed bootstrap and regtesting on powerpc64le-linux with no regressions.
> Ok for trunk?  
> 
> Peter
> 
> 
> 
> 2024-06-19  Peter Bergner  
> 
> gcc/
>   PR target/114759
>   * config/rs6000/rs6000-logue.cc (rs6000_stack_info): Use TARGET_POWER8.
>   (rs6000_emit_prologue): Likewise.
>   * config/rs6000/rs6000.md (hashchk): Likewise.
>   (hashst): Likewise.
>   Fix whitespace.
> 
> gcc/testsuite/
>   PR target/114759
>   * gcc.target/powerpc/pr114759-2.c: New test.
>   * lib/target-supports.exp (rop_ok): Use
>   check_effective_target_has_arch_pwr8.
> ---
>  gcc/config/rs6000/rs6000-logue.cc |  6 +++---
>  gcc/config/rs6000/rs6000.md   |  6 +++---
>  gcc/testsuite/gcc.target/powerpc/pr114759-2.c | 17 +
>  gcc/testsuite/lib/target-supports.exp |  2 +-
>  4 files changed, 24 insertions(+), 7 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr114759-2.c
> 
> diff --git a/gcc/config/rs6000/rs6000-logue.cc 
> b/gcc/config/rs6000/rs6000-logue.cc
> index c384e48e378..bd363b625a4 100644
> --- a/gcc/config/rs6000/rs6000-logue.cc
> +++ b/gcc/config/rs6000/rs6000-logue.cc
> @@ -716,7 +716,7 @@ rs6000_stack_info (void)
>info->calls_p = (!crtl->is_leaf || cfun->machine->ra_needs_full_frame);
>info->rop_hash_size = 0;
>  
> -  if (TARGET_POWER10
> +  if (TARGET_POWER8
>&& info->calls_p
>&& DEFAULT_ABI == ABI_ELFv2
>&& rs6000_rop_protect)
> @@ -3277,7 +3277,7 @@ rs6000_emit_prologue (void)
>/* NOTE: The hashst isn't needed if we're going to do a sibcall,
>   but there's no way to know that here.  Harmless except for
>   performance, of course.  */
> -  if (TARGET_POWER10 && rs6000_rop_protect && info->rop_hash_size != 0)
> +  if (TARGET_POWER8 && rs6000_rop_protect && info->rop_hash_size != 0)
>  {
>gcc_assert (DEFAULT_ABI == ABI_ELFv2);
>rtx stack_ptr = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM);
> @@ -5056,7 +5056,7 @@ rs6000_emit_epilogue (enum epilogue_type epilogue_type)
>  
>/* The ROP hash check must occur after the stack pointer is restored
>   (since the hash involves r1), and is not performed for a sibcall.  */
> -  if (TARGET_POWER10
> +  if (TARGET_POWER8
>&& rs6000_rop_protect
>&& info->rop_hash_size != 0
>&& epilogue_type != EPILOGUE_TYPE_SIBCALL)
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index a5d20594789..694076e311f 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -15808,9 +15808,9 @@ (define_insn "*cmpeqb_internal"
>  
>  (define_insn "hashst"
>[(set (match_operand:DI 0 "simple_offsettable_mem_operand" "=m")
> -(unspec_volatile:DI [(match_operand:DI 1 "int_reg_operand" "r")]
> + (unspec_volatile:DI [(match_operand:DI 1 "int_reg_operand" "r")]
>   UNSPEC_HASHST))]
> -  "TARGET_POWER10 && rs6000_rop_protect"
> +  "TARGET_POWER8 && rs6000_rop_protect"
>  {
>static char templ[32];
>const char *p = rs6000_privileged ? "p" : "";
> @@ -15823,7 +15823,7 @@ (define_insn "hashchk"
>[(unspec_volatile [(match_operand:DI 0 "int_reg_operand" "r")
>(match_operand:DI 1 "simple_offsettable_mem_operand" "m")]
>   UNSPEC_HASHCHK)]
> -  "TARGET_POWER10 && rs6000_rop_protect"
> +  "TARGET_POWER8 && rs6000_rop_protect"
>  {
>static char templ[32];
>const char *p = rs6000_privileged ? "p" : "";
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr114759-2.c 
> b/gcc/testsuite/gcc.target/powerpc/pr114759-2.c
> new file mode 100644
> index 000..3881ebd416e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr114759-2.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mdejagnu-cpu=power8 -mrop-protect" } */
> +/* { dg-require-effective-target rop_ok } Only enable on supported ABIs.  */
> +
> +/* Verify we generate ROP-protect hash insns when compiling for Power8.  */
> +
> +extern void foo (void);
> +
> +int
> +bar (void)
> +{
> +  foo ();
> +  return 5;
> +}
> +
> +/* { dg-final { scan-assembler-times {\mhashst\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mhashchk\M} 1 } } */
> diff --git a/gcc/testsuite/lib/target-supports.exp 
> b/gcc/testsuite/lib/target-supports.exp
> index e307f4e69ef.

[SPARC] Fix PR target/115608

2024-06-25 Thread Eric Botcazou
This passes -m32 when -mv8plus is specified on Linux (like on Solaris).

Applied to mainline and 14 branch.



2024-06-25  Eric Botcazou  

PR target/115608
* config/sparc/linux64.h (CC1_SPEC): Pass -m32 for -mv8plus.

-- 
Eric Botcazou

diff --git a/gcc/config/sparc/linux64.h b/gcc/config/sparc/linux64.h
index 1e2e4aef2ad..83e0d6874d9 100644
--- a/gcc/config/sparc/linux64.h
+++ b/gcc/config/sparc/linux64.h
@@ -162,7 +162,7 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
 "%{m32:%{m64:%emay not use both -m32 and -m64}} \
 %{m32:-mptr32 -mno-stack-bias %{!mlong-double-128:-mlong-double-64} \
   %{!mcpu*:-mcpu=cypress}} \
-%{mv8plus:-mptr32 -mno-stack-bias %{!mlong-double-128:-mlong-double-64} \
+%{mv8plus:-m32 -mptr32 -mno-stack-bias %{!mlong-double-128:-mlong-double-64} \
   %{!mcpu*:-mcpu=v9}} \
 %{!m32:%{!mcpu*:-mcpu=ultrasparc}} \
 %{!mno-vis:%{!m32:%{!mcpu=v9:-mvis}}}"


Re: [PATCH ver3] rs6000, altivec-1-runnable.c update the, require-effective-target

2024-06-25 Thread Kewen.Lin
Hi,

on 2024/6/25 03:00, Carl Love wrote:
> GCC maintainers:
> 
> version 3, rebased on current mainline tree.  Version 2 of the patch was out 
> of sync. Retested the patch on 
> Power 10 with no regressions.
> 
> version 2, update the dg options per the feedback.  Retested the patch on 
> Power 10 with no regressions.
> 
> This patch updates the dg options.
> 
> The patch has been tested on Power 10 with no regression failures.
> 
> Please let me know if this patch is acceptable for mainline.  Thanks.
> 
> Carl 
> 
> 
> 
> rs6000, altivec-1-runnable.c update the require-effective-target
> 
> Update the dg test directives.

OK with the very minor nit below tweaked, thanks!

> 
> gcc/testsuite/ChangeLog:
>   * gcc.target/powerpc/altivec-1-runnable.c: Change the
>   require-effective-target for the test.
> ---
>  gcc/testsuite/gcc.target/powerpc/altivec-1-runnable.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/powerpc/altivec-1-runnable.c 
> b/gcc/testsuite/gcc.target/powerpc/altivec-1-runnable.c
> index 4e32860a169..6763ff3ff8b 100644
> --- a/gcc/testsuite/gcc.target/powerpc/altivec-1-runnable.c
> +++ b/gcc/testsuite/gcc.target/powerpc/altivec-1-runnable.c
> @@ -1,7 +1,9 @@
> -/* { dg-do compile { target powerpc*-*-* } } */
> -/* { dg-options "-maltivec" } */
> +/* { dg-do run { target vmx_hw } } */
> +/* { dg-do compile { target { ! vmx_hw } } } */
> +/* { dg-options "-O2 -maltivec" } */
>  /* { dg-require-effective-target powerpc_altivec } */
>  
> +

Nit: This newline is useless.

BR,
Kewen

>  #include 
>  
>  #ifdef DEBUG



x86_64-gnu-linux bootstrap fail (was: [PATCH v2 2/6] Extract ix86 dllimport implementation to mingw)

2024-06-25 Thread Tobias Burnus

Hi Evgeny,

I am not sure whether I have chosen the right email in the thread, but
an x86-64 GNU/Linux build currently fails as follows.

At a glance, it seems to be sufficient to remove the prototype 
declaration in i386.cc.


Namely:

gcc/config/i386/i386.cc:107:12: error: 'rtx_def* 
legitimize_dllimport_symbol(rtx, bool)' declared 'static' but never 
defined [-Werror=unused-function]

  107 | static rtx legitimize_dllimport_symbol (rtx, bool);
  |^~~

gcc/gcc/config/i386/i386.cc:108:12: error: 'rtx_def* 
legitimize_pe_coff_extern_decl(rtx, bool)' declared 'static' but never 
defined [-Werror=unused-function]

  108 | static rtx legitimize_pe_coff_extern_decl (rtx, bool);
  |^~
^Cmake[3]: *** [Makefile:2556: i386.o] Interrupt

There is:

config/i386/i386.cc:static rtx legitimize_dllimport_symbol (rtx, bool);
config/mingw/winnt-dll.cc:legitimize_dllimport_symbol (rtx symbol, bool 
want_reg)
config/mingw/winnt-dll.cc:  return legitimize_dllimport_symbol 
(addr, inreg);
config/mingw/winnt-dll.cc:rtx t = legitimize_dllimport_symbol 
(XEXP (XEXP (addr, 0), 0), inreg);



And:

config/i386/i386.cc:static rtx legitimize_pe_coff_extern_decl (rtx, bool);
config/mingw/winnt-dll.cc:legitimize_pe_coff_extern_decl (rtx symbol, 
bool want_reg)
config/mingw/winnt-dll.cc:return legitimize_pe_coff_extern_decl 
(addr, inreg);
config/mingw/winnt-dll.cc:  rtx t = legitimize_pe_coff_extern_decl 
(XEXP (XEXP (addr, 0), 0), inreg);


Tobias


Re: [PATCH 1/3] Release structures on function return

2024-06-25 Thread Jan Hubicka
> The value vec objects are destroyed on exit, but release still needs to
> be called explicitly.
> 
> gcc/ChangeLog:
> 
>   * tree-profile.cc (find_conditions): Release vectors before
> return.
I wonder if you turn
hash_map, vec> exprs;
to
hash_map, auto_vec> exprs;
Won't hash_map destructor take care of this by itself?

Honza
> ---
>  gcc/tree-profile.cc | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/gcc/tree-profile.cc b/gcc/tree-profile.cc
> index e4bb689cef5..18f48e8d04e 100644
> --- a/gcc/tree-profile.cc
> +++ b/gcc/tree-profile.cc
> @@ -919,6 +919,9 @@ find_conditions (struct function *fn)
>  if (!have_post_dom)
>   free_dominance_info (fn, CDI_POST_DOMINATORS);
>  
> +for (auto expr : exprs)
> +  expr.second.release ();
> +
>  cov->m_masks.safe_grow_cleared (2 * cov->m_index.last ());
>  const size_t length = cov_length (cov);
>  for (size_t i = 0; i != length; i++)
> -- 
> 2.39.2
> 


Re: [PATCH 2/3] Add section on MC/DC in gcov manual

2024-06-25 Thread Jan Hubicka
> gcc/ChangeLog:
> 
>   * doc/gcov.texi: Add MC/DC section.
OK,
thanks!
Honza
> ---
>  gcc/doc/gcov.texi | 72 +++
>  1 file changed, 72 insertions(+)
> 
> diff --git a/gcc/doc/gcov.texi b/gcc/doc/gcov.texi
> index dc79bccb8cf..a9221738cce 100644
> --- a/gcc/doc/gcov.texi
> +++ b/gcc/doc/gcov.texi
> @@ -917,6 +917,78 @@ of times the call was executed will be printed.  This 
> will usually be
>  100%, but may be less for functions that call @code{exit} or @code{longjmp},
>  and thus may not return every time they are called.
>  
> +When you use the @option{-g} option, your output looks like this:
> +
> +@smallexample
> +$ gcov -t -m -g tmp
> +-:0:Source:tmp.cpp
> +-:0:Graph:tmp.gcno
> +-:0:Data:tmp.gcda
> +-:0:Runs:1
> +-:1:#include 
> +-:2:
> +-:3:int
> +1:4:main (void)
> +-:5:@{
> +-:6:  int i, total;
> +1:7:  total = 0;
> +-:8:
> +   11:9:  for (i = 0; i < 10; i++)
> +condition outcomes covered 2/2
> +   10:   10:total += i;
> +-:   11:
> +   1*:   12:  int v = total > 100 ? 1 : 2;
> +condition outcomes covered 1/2
> +condition  0 not covered (true)
> +-:   13:
> +   1*:   14:  if (total != 45 && v == 1)
> +condition outcomes covered 1/4
> +condition  0 not covered (true)
> +condition  1 not covered (true false)
> +#:   15:printf ("Failure\n");
> +-:   16:  else
> +1:   17:printf ("Success\n");
> +1:   18:  return 0;
> +-:   19:@}
> +@end smallexample
> +
> +For every condition the number of taken and total outcomes are
> +printed, and if there are uncovered outcomes a line will be printed
> +for each condition showing the uncovered outcome in parentheses.
> +Conditions are identified by their index -- index 0 is the left-most
> +condition.  In @code{a || (b && c)}, @var{a} is condition 0, @var{b}
> +condition 1, and @var{c} condition 2.
> +
> +An outcome is considered covered if it has an independent effect on
> +the decision, also known as masking MC/DC (Modified Condition/Decision
> +Coverage).  In this example the decision evaluates to true and @var{a}
> +is evaluated, but not covered.  This is because @var{a} cannot affect
> +the decision independently -- both @var{a} and @var{b} must change
> +value for the decision to change.
> +
> +@smallexample
> +$ gcov -t -m -g tmp
> +-:0:Source:tmp.c
> +-:0:Graph:tmp.gcno
> +-:0:Data:tmp.gcda
> +-:0:Runs:1
> +-:1:#include 
> +-:2:
> +1:3:int main()
> +-:4:@{
> +1:5:  int a = 1;
> +1:6:  int b = 0;
> +-:7:
> +1:8:  if (a && b)
> +condition outcomes covered 1/4
> +condition  0 not covered (true false)
> +condition  1 not covered (true)
> +#:9:printf ("Success!\n");
> +-:   10:  else
> +1:   11:printf ("Failure!\n");
> +-:   12:@}
> +@end smallexample
> +
>  The execution counts are cumulative.  If the example program were
>  executed again without removing the @file{.gcda} file, the count for the
>  number of times each line in the source was executed would be added to
> -- 
> 2.39.2
> 


Re: [PATCH 3/3] Use the term MC/DC in help for gcov --conditions

2024-06-25 Thread Jan Hubicka
> Without key terms like "masking" and "MC/DC" it is not at all obvious
> what --conditions actually reports on, and there is no easy path for the
> user to figure out. By at least including the two key terms MC/DC and
> masking users have something to search for.
> 
> gcc/ChangeLog:
> 
> * gcov.cc (print_usage): Reference masking MC/DC.

So the main purpose is to point users to the masking MC/DC description in
the manual?  Asking Google does not seem to do the trick so far, but
I don't know of better options.

OK,
Thanks
> ---
>  gcc/gcov.cc | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/gcov.cc b/gcc/gcov.cc
> index f6787f0be8f..1e2e193d79d 100644
> --- a/gcc/gcov.cc
> +++ b/gcc/gcov.cc
> @@ -1015,7 +1015,7 @@ print_usage (int error_p)
>fnotice (file, "  -c, --branch-counts Output counts of 
> branches taken\n\
>  rather than percentages\n");
>fnotice (file, "  -g, --conditionsInclude modified 
> condition/decision\n\
> -coverage in output\n");
> +coverage (masking MC/DC) in output\n");
>fnotice (file, "  -d, --display-progress  Display progress 
> information\n");
>fnotice (file, "  -D, --debug  Display debugging 
> dumps\n");
>fnotice (file, "  -f, --function-summariesOutput summaries for 
> each function\n");
> -- 
> 2.39.2
> 


Re: [PATCH 7/8] vect: Support multiple lane-reducing operations for loop reduction [PR114440]

2024-06-25 Thread Richard Biener
On Tue, Jun 25, 2024 at 11:32 AM Feng Xue OS
 wrote:
>
> >>
> >> >> -  if (slp_node)
> >> >> +  if (slp_node && SLP_TREE_LANES (slp_node) > 1)
> >> >
> >> > Hmm, that looks wrong.  It looks like SLP_TREE_NUMBER_OF_VEC_STMTS is off
> >> > instead, which is bad.
> >> >
> >> >> nvectors = SLP_TREE_NUMBER_OF_VEC_STMTS (slp_node);
> >> >>else
> >> >> nvectors = vect_get_num_copies (loop_vinfo, vectype_in);
> >> >> @@ -7478,6 +7472,152 @@ vect_reduction_update_partial_vector_usage 
> >> >> (loop_vec_info loop_vinfo,
> >> >>  }
> >> >>  }
> >> >>
> >> >> +/* Check if STMT_INFO is a lane-reducing operation that can be 
> >> >> vectorized in
> >> >> +   the context of LOOP_VINFO, and vector cost will be recorded in 
> >> >> COST_VEC.
> >> >> +   Now there are three such kinds of operations: dot-prod/widen-sum/sad
> >> >> +   (sum-of-absolute-differences).
> >> >> +
> >> >> +   For a lane-reducing operation, the loop reduction path that it lies 
> >> >> in,
> >> >> +   may contain normal operation, or other lane-reducing operation of 
> >> >> different
> >> >> +   input type size, an example as:
> >> >> +
> >> >> + int sum = 0;
> >> >> + for (i)
> >> >> +   {
> >> >> + ...
> >> >> + sum += d0[i] * d1[i];   // dot-prod 
> >> >> + sum += w[i];// widen-sum 
> >> >> + sum += abs(s0[i] - s1[i]);  // sad 
> >> >> + sum += n[i];// normal 
> >> >> + ...
> >> >> +   }
> >> >> +
> >> >> +   Vectorization factor is essentially determined by operation whose 
> >> >> input
> >> >> +   vectype has the most lanes ("vector(16) char" in the example), 
> >> >> while we
> >> >> +   need to choose input vectype with the least lanes ("vector(4) int" 
> >> >> in the
> >> >> +   example) for the reduction PHI statement.  */
> >> >> +
> >> >> +bool
> >> >> +vectorizable_lane_reducing (loop_vec_info loop_vinfo, stmt_vec_info 
> >> >> stmt_info,
> >> >> +   slp_tree slp_node, stmt_vector_for_cost 
> >> >> *cost_vec)
> >> >> +{
> >> >> +  gimple *stmt = stmt_info->stmt;
> >> >> +
> >> >> +  if (!lane_reducing_stmt_p (stmt))
> >> >> +return false;
> >> >> +
> >> >> +  tree type = TREE_TYPE (gimple_assign_lhs (stmt));
> >> >> +
> >> >> +  if (!INTEGRAL_TYPE_P (type) && !SCALAR_FLOAT_TYPE_P (type))
> >> >> +return false;
> >> >> +
> >> >> +  /* Do not try to vectorize bit-precision reductions.  */
> >> >> +  if (!type_has_mode_precision_p (type))
> >> >> +return false;
> >> >> +
> >> >> +  if (!slp_node)
> >> >> +return false;
> >> >> +
> >> >> +  for (int i = 0; i < (int) gimple_num_ops (stmt) - 1; i++)
> >> >> +{
> >> >> +  stmt_vec_info def_stmt_info;
> >> >> +  slp_tree slp_op;
> >> >> +  tree op;
> >> >> +  tree vectype;
> >> >> +  enum vect_def_type dt;
> >> >> +
> >> >> +  if (!vect_is_simple_use (loop_vinfo, stmt_info, slp_node, i, &op,
> >> >> +  &slp_op, &dt, &vectype, &def_stmt_info))
> >> >> +   {
> >> >> + if (dump_enabled_p ())
> >> >> +   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> >> >> +"use not simple.\n");
> >> >> + return false;
> >> >> +   }
> >> >> +
> >> >> +  if (!vectype)
> >> >> +   {
> >> >> + vectype = get_vectype_for_scalar_type (loop_vinfo, TREE_TYPE 
> >> >> (op),
> >> >> +slp_op);
> >> >> + if (!vectype)
> >> >> +   return false;
> >> >> +   }
> >> >> +
> >> >> +  if (!vect_maybe_update_slp_op_vectype (slp_op, vectype))
> >> >> +   {
> >> >> + if (dump_enabled_p ())
> >> >> +   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> >> >> +"incompatible vector types for 
> >> >> invariants\n");
> >> >> + return false;
> >> >> +   }
> >> >> +
> >> >> +  if (i == STMT_VINFO_REDUC_IDX (stmt_info))
> >> >> +   continue;
> >> >> +
> >> >> +  /* There should be at most one cycle def in the stmt.  */
> >> >> +  if (VECTORIZABLE_CYCLE_DEF (dt))
> >> >> +   return false;
> >> >> +}
> >> >> +
> >> >> +  stmt_vec_info reduc_info = STMT_VINFO_REDUC_DEF (vect_orig_stmt 
> >> >> (stmt_info));
> >> >> +
> >> >> +  /* TODO: Support lane-reducing operation that does not directly 
> >> >> participate
> >> >> + in loop reduction. */
> >> >> +  if (!reduc_info || STMT_VINFO_REDUC_IDX (stmt_info) < 0)
> >> >> +return false;
> >> >> +
> >> >> +  /* Lane-reducing pattern inside any inner loop of LOOP_VINFO is not
> >> >> + recognized.  */
> >> >> +  gcc_assert (STMT_VINFO_DEF_TYPE (reduc_info) == vect_reduction_def);
> >> >> +  gcc_assert (STMT_VINFO_REDUC_TYPE (reduc_info) == 
> >> >> TREE_CODE_REDUCTION);
> >> >> +
> >> >> +  tree vectype_in = STMT_VINFO_REDUC_VECTYPE_IN (stmt_info);
> >> >> +  int ncopies_for_cost;
> >> >> +
> >> >> +  if (SLP_TREE

Re: [PATCH] Add a late-combine pass [PR106594]

2024-06-25 Thread Richard Biener
On Mon, Jun 24, 2024 at 9:38 PM Segher Boessenkool
 wrote:
>
> I didn't see this before.   Sigh.
>
> On Tue, Jan 02, 2024 at 09:47:11AM +, Richard Sandiford wrote:
> > Segher Boessenkool  writes:
> > > On Tue, Oct 24, 2023 at 07:49:10PM +0100, Richard Sandiford wrote:
> > >> This patch adds a combine pass that runs late in the pipeline.
> > >
> > > But it is not.  It is a completely new thing, and much closer to
> > > fwprop than to combine, too.
> >
> > Well, it is a combine pass.
>
> No, it is not.  In the context of GCC combine is the instruction
> combiner.  Which does something else than this does.
>
> So use a different name.  Please.  It will be NAKked by the combine
> maintainer otherwise.
>
> > It's not a new instance of the pass in
> > combine.cc, but I don't think that's the implication.  We already have
> > two combine passes: the combine.cc one and the postreload one.
>
> There is no postreload-combine pass.  There is a postreload pass that
> does various trivial things.  One of those is reload_combine, which is
> nothing like combine.  It is a kind of limited fwprop for memory
> addressing.
>
> > > Could you rename it to something else, please?  Something less confusing
> > > to both users and maintainers :-)
> >
> > Do you have any suggestions?
>
> Since it is something like fwprop, maybe something like that?  Or maybe
> put "addressing" in the name, if that is the point here.
>
> > >> The pass currently has a single objective: remove definitions by
> > >> substituting into all uses.
> > >
> > > The easy case ;-)
> >
> > And yet a case that no existing pass handles. :)  That's why I'm
> > trying to add something that does.
>
> So, fwprop.

fullprop

Richard.

>
> Segher


Re: [PATCH 1/3] Release structures on function return

2024-06-25 Thread Jørgen Kvalsvik

On 6/25/24 12:23, Jan Hubicka wrote:

> > The value vec objects are destroyed on exit, but release still needs to
> > be called explicitly.
> >
> > gcc/ChangeLog:
> >
> >   * tree-profile.cc (find_conditions): Release vectors before
> >   return.
>
> I wonder if you turn
>  hash_map, vec> exprs;
> to
>  hash_map, auto_vec> exprs;
> Won't hash_map destructor take care of this by itself?


It does, actually - I think I tried something to that effect at one 
point and the auto_vec's non-copy semantics got in the way. Apparently I 
either misremember trying, or I did something differently at the time. 
auto_vec is much nicer, of course, I'll update the patch. Thanks!
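
A minimal sketch of the ownership difference discussed here, using
standard-library stand-ins rather than GCC's own containers.  The types
manual_vec and owning_vec are hypothetical: they only mimic a vector that
needs an explicit release () and one that releases in its destructor, and
std::unordered_map stands in for hash_map.  Storage is modelled by a
counter, not by real heap memory.

```cpp
#include <cassert>
#include <unordered_map>
#include <utility>

// Number of buffers currently "allocated"; a counter is used instead of
// real heap memory so that the leak is observable but harmless.
static int live_buffers = 0;

// Hypothetical stand-in for a vec-like type: storage is given back only
// when release () is called explicitly.
struct manual_vec
{
  bool has_storage = false;

  void reserve ()
  {
    assert (!has_storage);
    has_storage = true;
    ++live_buffers;
  }

  void release ()
  {
    if (has_storage)
      {
        has_storage = false;
        --live_buffers;
      }
  }
};

// Hypothetical stand-in for an auto_vec-like type: move-only, and the
// destructor releases the storage.
struct owning_vec : manual_vec
{
  owning_vec () = default;
  owning_vec (const owning_vec &) = delete;
  owning_vec &operator= (const owning_vec &) = delete;
  owning_vec (owning_vec &&other) noexcept
  {
    has_storage = std::exchange (other.has_storage, false);
  }
  ~owning_vec () { release (); }
};

// Fill a map with N entries, let it go out of scope, and return how many
// buffers are still allocated afterwards.
template <typename Vec>
int buffers_left_after_scope (int n, bool release_by_hand)
{
  live_buffers = 0;
  {
    std::unordered_map<int, Vec> exprs;
    for (int i = 0; i < n; i++)
      exprs[i].reserve ();
    if (release_by_hand)
      for (auto &expr : exprs)
        expr.second.release ();
  }
  return live_buffers;
}
```

With the manual type the buffers survive the map unless the release loop
runs; with the owning type the map's destructor is enough.  Note that the
loop takes each entry by reference, since the owning type cannot be copied.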




> Honza
>
> > ---
> >  gcc/tree-profile.cc | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/gcc/tree-profile.cc b/gcc/tree-profile.cc
> > index e4bb689cef5..18f48e8d04e 100644
> > --- a/gcc/tree-profile.cc
> > +++ b/gcc/tree-profile.cc
> > @@ -919,6 +919,9 @@ find_conditions (struct function *fn)
> >  if (!have_post_dom)
> >   free_dominance_info (fn, CDI_POST_DOMINATORS);
> >  
> > +for (auto expr : exprs)
> > +  expr.second.release ();
> > +
> >  cov->m_masks.safe_grow_cleared (2 * cov->m_index.last ());
> >  const size_t length = cov_length (cov);
> >  for (size_t i = 0; i != length; i++)
> > --
> > 2.39.2





Re: [PATCH 3/3] Use the term MC/DC in help for gcov --conditions

2024-06-25 Thread Jørgen Kvalsvik

On 6/25/24 12:25, Jan Hubicka wrote:

> > Without key terms like "masking" and "MC/DC" it is not at all obvious
> > what --conditions actually reports on, and there is no easy path for the
> > user to figure out. By at least including the two key terms MC/DC and
> > masking users have something to search for.
> >
> > gcc/ChangeLog:
> >
> >   * gcov.cc (print_usage): Reference masking MC/DC.
>
> So the main purpose is to point users to the masking MC/DC description in
> the manual?  Asking Google does not seem to do the trick so far, but
> I don't know of better options.


Either to the manual (which has a very brief example), or to Google to
find blog posts or papers like "An Investigation of Three Forms of
the Modified Condition Decision Coverage (MCDC) Criterion" by Chilenski
(2001) and
https://ntrs.nasa.gov/api/citations/20040086014/downloads/20040086014.pdf
(a tutorial on MC/DC by Hayhurst et al.). Granted, the web is a bit
sparse on anything beyond trivial (and usually, quietly, unique-cause)
MC/DC examples, but I'm not sure the gcc/gcov manual is the home for it.
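
As a small illustration of what "masking" means for the decision
`a && b` used in the manual's example, here is a hedged sketch.  It is a
hand-written model of that one decision, not how gcov computes coverage;
the names outcome, covered_by and count_covered are made up for this
example.

```cpp
#include <cassert>

// Bit flags for the four condition outcomes of the decision `a && b`.
enum outcome : unsigned
{
  A_TRUE = 1u << 0,
  A_FALSE = 1u << 1,
  B_TRUE = 1u << 2,
  B_FALSE = 1u << 3
};

// Outcomes covered by one evaluation of `a && b` under masking MC/DC.
unsigned covered_by (bool a, bool b)
{
  if (!a)
    return A_FALSE;           // short-circuit: b is never evaluated
  if (b)
    return A_TRUE | B_TRUE;   // neither operand masks the other
  return B_FALSE;             // b == false masks a's true outcome
}

// Number of covered outcomes (out of 4) in an accumulated set.
int count_covered (unsigned set)
{
  assert (set < 16u);
  int n = 0;
  for (; set; set &= set - 1)
    n++;
  return n;
}
```

A single run with a = 1 and b = 0 covers only the false outcome of b,
which corresponds to the "condition outcomes covered 1/4" line in the
manual's example; OR-ing the sets from the three distinct evaluations
reaches 4/4.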


Thanks,
Jørgen



> OK,
> Thanks
>
> > ---
> >  gcc/gcov.cc | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/gcc/gcov.cc b/gcc/gcov.cc
> > index f6787f0be8f..1e2e193d79d 100644
> > --- a/gcc/gcov.cc
> > +++ b/gcc/gcov.cc
> > @@ -1015,7 +1015,7 @@ print_usage (int error_p)
> >    fnotice (file, "  -c, --branch-counts Output counts of branches taken\n\
> >  rather than percentages\n");
> >    fnotice (file, "  -g, --conditionsInclude modified condition/decision\n\
> > -coverage in output\n");
> > +coverage (masking MC/DC) in output\n");
> >    fnotice (file, "  -d, --display-progress  Display progress information\n");
> >    fnotice (file, "  -D, --debug  Display debugging dumps\n");
> >    fnotice (file, "  -f, --function-summariesOutput summaries for each function\n");
> > --
> > 2.39.2





x86_64-gnu-linux bootstrap fail (was: [PATCH v2 2/6] Extract ix86 dllimport implementation to mingw)

2024-06-25 Thread Evgeny Karpov
Tuesday, June 25, 2024 12:03 PM
Tobias Burnus  wrote:
> 
> Hi Evgeny,
> 
> I am not sure whether I have chosen the right email in the thread, but
> an x86-64 GNU/Linux build currently fails as follows.
> 
> At a glance, it seems to be sufficient to remove the prototype
> declaration in i386.cc.
> 
> Namely:
> 
> gcc/config/i386/i386.cc:107:12: error: 'rtx_def*
> legitimize_dllimport_symbol(rtx, bool)' declared 'static' but never
> defined [-Werror=unused-function]
>107 | static rtx legitimize_dllimport_symbol (rtx, bool);
>|^~~
> 
> gcc/gcc/config/i386/i386.cc:108:12: error: 'rtx_def*
> legitimize_pe_coff_extern_decl(rtx, bool)' declared 'static' but never
> defined [-Werror=unused-function]
>108 | static rtx legitimize_pe_coff_extern_decl (rtx, bool);
>|^~
> ^Cmake[3]: *** [Makefile:2556: i386.o] Interrupt
> 
> There is:
> 
> config/i386/i386.cc:static rtx legitimize_dllimport_symbol (rtx, bool);
> config/mingw/winnt-dll.cc:legitimize_dllimport_symbol (rtx symbol, bool
> want_reg)
> config/mingw/winnt-dll.cc:  return legitimize_dllimport_symbol
> (addr, inreg);
> config/mingw/winnt-dll.cc:rtx t = legitimize_dllimport_symbol
> (XEXP (XEXP (addr, 0), 0), inreg);
> 
> 
> And:
> 
> config/i386/i386.cc:static rtx legitimize_pe_coff_extern_decl (rtx, bool);
> config/mingw/winnt-dll.cc:legitimize_pe_coff_extern_decl (rtx symbol,
> bool want_reg)
> config/mingw/winnt-dll.cc:return legitimize_pe_coff_extern_decl
> (addr, inreg);
> config/mingw/winnt-dll.cc:  rtx t = legitimize_pe_coff_extern_decl
> (XEXP (XEXP (addr, 0), 0), inreg);
> 
> Tobias

Thank you, Tobias, for reporting the problem.

The x86_64-gnu-linux configuration had been built; however, it looks like
that build was missing a check for unused functions.
A fix will be prepared, tested and submitted to the mailing list today.

Regards,
Evgeny


Re: [PATCH][v2] Support single def-use cycle optimization for SLP reduction vectorization

2024-06-25 Thread Thomas Schwinge
Hi!

On 2024-06-14T11:08:15+0200, Richard Biener  wrote:
> We can at least mimic single def-use cycle optimization when doing
> single-lane SLP reductions and that's required to avoid regressing
> compared to non-SLP.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
>
>   * tree-vect-loop.cc (vectorizable_reduction): Allow
>   single-def-use cycles with SLP.
>   (vect_transform_reduction): Handle SLP single def-use cycles.
>   (vect_transform_cycle_phi): Likewise.
>
>   * gcc.dg/vect/slp-reduc-12.c: New testcase.

For GCN target (tested '-march=gfx908' on current sources), I see:

+PASS: gcc.dg/vect/slp-reduc-12.c (test for excess errors)
+FAIL: gcc.dg/vect/slp-reduc-12.c scan-tree-dump vect "using single def-use 
cycle for reduction"

..., where we've got (see attached):

[...]
[...]/gcc.dg/vect/slp-reduc-12.c:10:21: optimized: loop vectorized using 
256 byte vectors
[...]
[...]/gcc.dg/vect/slp-reduc-12.c:10:21: note:   Reduce using direct vector 
reduction.
[...]/gcc.dg/vect/slp-reduc-12.c:10:21: note:   vectorizing stmts using SLP.
[...]

How to address?


Grüße
 Thomas


>  gcc/testsuite/gcc.dg/vect/slp-reduc-12.c | 18 ++
>  gcc/tree-vect-loop.cc| 45 ++--
>  2 files changed, 45 insertions(+), 18 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/vect/slp-reduc-12.c
>
> diff --git a/gcc/testsuite/gcc.dg/vect/slp-reduc-12.c 
> b/gcc/testsuite/gcc.dg/vect/slp-reduc-12.c
> new file mode 100644
> index 000..625f8097c54
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/slp-reduc-12.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target vect_double } */
> +/* { dg-require-effective-target vect_int } */
> +/* { dg-require-effective-target vect_hw_misalign } */
> +/* { dg-additional-options "-Ofast" } */
> +
> +double foo (double *x, int * __restrict a, int n)
> +{
> +  double r = 0.;
> +  for (int i = 0; i < n; ++i)
> +{
> +  a[i] = a[i] + i;
> +  r += x[i];
> +}
> +  return r;
> +}
> +
> +/* { dg-final { scan-tree-dump "using single def-use cycle for reduction" 
> "vect" } } */
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index bbd5d261907..d9a2ad69484 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -8320,7 +8320,11 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
> participating.  When unrolling we want each unrolled iteration to have its
> own reduction accumulator since one of the main goals of unrolling a
> reduction is to reduce the aggregate loop-carried latency.  */
> -  if (ncopies > 1
> +  if ((ncopies > 1
> +   || (slp_node
> +&& !REDUC_GROUP_FIRST_ELEMENT (stmt_info)
> +&& SLP_TREE_LANES (slp_node) == 1
> +&& vect_get_num_copies (loop_vinfo, vectype_in) > 1))
>&& (STMT_VINFO_RELEVANT (stmt_info) <= vect_used_only_live)
>&& reduc_chain_length == 1
>&& loop_vinfo->suggested_unroll_factor == 1)
> @@ -8373,6 +8377,10 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
>   single_defuse_cycle = false;
>   }
>  }
> +  if (dump_enabled_p () && single_defuse_cycle)
> +dump_printf_loc (MSG_NOTE, vect_location,
> +  "using single def-use cycle for reduction by reducing "
> +  "multiple vectors to one in the loop body\n");
>STMT_VINFO_FORCE_SINGLE_CYCLE (reduc_info) = single_defuse_cycle;
>  
>/* If the reduction stmt is one of the patterns that have lane
> @@ -8528,9 +8536,8 @@ vect_transform_reduction (loop_vec_info loop_vinfo,
>  {
>tree vectype_out = STMT_VINFO_VECTYPE (stmt_info);
>class loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
> -  int i;
> -  int ncopies;
> -  int vec_num;
> +  unsigned ncopies;
> +  unsigned vec_num;
>  
>stmt_vec_info reduc_info = info_for_reduction (loop_vinfo, stmt_info);
>gcc_assert (reduc_info->is_reduc_info);
> @@ -8577,7 +8584,6 @@ vect_transform_reduction (loop_vec_info loop_vinfo,
>auto_vec vec_oprnds0;
>auto_vec vec_oprnds1;
>auto_vec vec_oprnds2;
> -  tree def0;
>  
>if (dump_enabled_p ())
>  dump_printf_loc (MSG_NOTE, vect_location, "transform reduction.\n");
> @@ -8652,20 +8658,21 @@ vect_transform_reduction (loop_vec_info loop_vinfo,
>   definition.  */
>if (single_defuse_cycle)
>  {
> -  gcc_assert (!slp_node);
> -  vect_get_vec_defs_for_operand (loop_vinfo, stmt_info, 1,
> -  op.ops[reduc_index],
> -  reduc_index == 0 ? &vec_oprnds0
> -  : (reduc_index == 1 ? &vec_oprnds1
> - : &vec_oprnds2));
> +  vect_get_vec_defs (loop_vinfo, stmt_info, slp_node, 1,
> +  reduc_index == 0 ? op.ops[0] : NULL_TREE, &vec_oprnds0,
> +  reduc_index == 1 ? op.ops[1] : NULL_TREE, &vec_oprnds1,
> +  redu

Re: [PATCH] c++: ICE with __has_unique_object_representations [PR115476]

2024-06-25 Thread Jonathan Wakely
On Tue, 25 Jun 2024 at 03:12, Jason Merrill  wrote:
>
> On 6/18/24 10:31, Marek Polacek wrote:
> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/14/13?
>
> Makes sense to me, though probably the [meta.unary.prop] table should be
> adjusted in the same way.  Jonathan, what do you think?

Just to make sure I understand correctly, the suggestion is to change
the precondition for the trait to something like:

"remove_all_extents_t<T> shall be a complete type or cv void."

i.e. if T is incomplete then T[] cannot be used with the trait, right?

>
> > -- >8 --
> > Here we started to ICE with r13-25: in check_trait_type, for "X[]" we
> > return true here:
> >
> >if (kind == 1 && TREE_CODE (type) == ARRAY_TYPE && !TYPE_DOMAIN (type))
> >  return true; // Array of unknown bound. Don't care about completeness.
> >
> > and then end up crashing in record_has_unique_obj_representations:
> >
> > 4836if (cur != wi::to_offset (sz))
> >
> > because sz is null.
> >
> > https://eel.is/c++draft/type.traits#tab:meta.unary.prop-row-47-column-3-sentence-1
> > says that the preconditions for __has_unique_object_representations are:
> > "T shall be a complete type, cv void, or an array of unknown bound" and
> > that "For an array type T, the same result as
> > has_unique_object_representations_v<remove_all_extents_t<T>>" so T[]
> > should be treated as T.  So we should use kind==2 for the trait.
> >
> >   PR c++/115476
> >
> > gcc/cp/ChangeLog:
> >
> >   * semantics.cc (finish_trait_expr)
> >   <case CPTK_HAS_UNIQUE_OBJ_REPRESENTATIONS>: Move below to call
> >   check_trait_type with kind==2.
> >
> > gcc/testsuite/ChangeLog:
> >
> >   * g++.dg/cpp1z/has-unique-obj-representations4.C: New test.
> > ---
> >   gcc/cp/semantics.cc  |  2 +-
> >   .../cpp1z/has-unique-obj-representations4.C  | 16 
> >   2 files changed, 17 insertions(+), 1 deletion(-)
> >   create mode 100644 
> > gcc/testsuite/g++.dg/cpp1z/has-unique-obj-representations4.C
> >
> > diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
> > index 08f5f245e7d..42251b6764b 100644
> > --- a/gcc/cp/semantics.cc
> > +++ b/gcc/cp/semantics.cc
> > @@ -12966,7 +12966,6 @@ finish_trait_expr (location_t loc, cp_trait_kind 
> > kind, tree type1, tree type2)
> >   case CPTK_HAS_NOTHROW_COPY:
> >   case CPTK_HAS_TRIVIAL_COPY:
> >   case CPTK_HAS_TRIVIAL_DESTRUCTOR:
> > -case CPTK_HAS_UNIQUE_OBJ_REPRESENTATIONS:
> > if (!check_trait_type (type1))
> >   return error_mark_node;
> > break;
> > @@ -12976,6 +12975,7 @@ finish_trait_expr (location_t loc, cp_trait_kind 
> > kind, tree type1, tree type2)
> >   case CPTK_IS_STD_LAYOUT:
> >   case CPTK_IS_TRIVIAL:
> >   case CPTK_IS_TRIVIALLY_COPYABLE:
> > +case CPTK_HAS_UNIQUE_OBJ_REPRESENTATIONS:
> > if (!check_trait_type (type1, /* kind = */ 2))
> >   return error_mark_node;
> > break;
> > diff --git a/gcc/testsuite/g++.dg/cpp1z/has-unique-obj-representations4.C 
> > b/gcc/testsuite/g++.dg/cpp1z/has-unique-obj-representations4.C
> > new file mode 100644
> > index 000..d6949dc7005
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/cpp1z/has-unique-obj-representations4.C
> > @@ -0,0 +1,16 @@
> > +// PR c++/115476
> > +// { dg-do compile { target c++11 } }
> > +
> > +struct X;
> > +static_assert(__has_unique_object_representations(X), ""); // { 
> > dg-error "invalid use of incomplete type" }
> > +static_assert(__has_unique_object_representations(X[]), "");  // { 
> > dg-error "invalid use of incomplete type" }
> > +static_assert(__has_unique_object_representations(X[1]), "");  // { 
> > dg-error "invalid use of incomplete type" }
> > +static_assert(__has_unique_object_representations(X[][1]), "");  // { 
> > dg-error "invalid use of incomplete type" }
> > +
> > +struct X {
> > +  int x;
> > +};
> > +static_assert(__has_unique_object_representations(X), "");
> > +static_assert(__has_unique_object_representations(X[]), "");
> > +static_assert(__has_unique_object_representations(X[1]), "");
> > +static_assert(__has_unique_object_representations(X[][1]), "");
> >
> > base-commit: 7f9be55a4630134a237219af9cc8143e02080380
>



[PATCH] arm: make arm_predict_doloop_p reject loops with calls

2024-06-25 Thread Andre Vieira (lists)

Hi,

With the introduction of low overhead loops in
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=3dfc28dbbd21b1d708aa40064380ef4c42c994d7
we defined arm_predict_doloop_p. It is meant to be a lightweight check
to rule out loops we are not considering for doloop optimization, and it
is used by other passes to prevent optimizations that may hurt the
doloop optimization later on. The check has to be lightweight because
it is used by pre-RTL optimizations, meaning we can't do the same checks
that doloop does.
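
To make the idea of such a cheap pre-RTL filter concrete, here is a toy
sketch.  It does not use GCC's GIMPLE API; stmt_kind and
toy_predict_doloop_p are hypothetical names, and the loop header is
modelled as a plain list of statement kinds.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, much-simplified statement model; the real hook walks
// GIMPLE statements with a gimple_stmt_iterator.
enum class stmt_kind { assign, builtin_call, call };

// Return true if the loop header may still become a low-overhead loop,
// i.e. it contains no call other than a builtin.
bool toy_predict_doloop_p (const std::vector<stmt_kind> &header)
{
  for (stmt_kind s : header)
    if (s == stmt_kind::call)
      return false;   // corresponds to "call in loop" in the dump
  return true;
}
```

The scan is linear in the number of header statements, which is the kind
of cost a predicate consulted by earlier passes can afford.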


After the definition of arm_predict_doloop_p, when testing for 
armv8.1-m.main, tree-ssa/ivopts-3.c failed the scan-dump check as the 
dump now matched an extra '!= 0' introduced by:

Doloop cmp iv use: if (ivtmp_1 != 0)
Predict loop 1 can perform doloop optimization later.

where previously we had:
Predict doloop failure due to target specific checks.

and after this patch:
Predict doloop failure due to call in loop.
Predict doloop failure due to target specific checks.

Added a copy of the original tree-ssa/ivopts-3.c as a target-specific
test to check for the new dump message.


Ran a regression test for arm-none-eabi with 
-march=armv8.1-m.main+mve/-mfpu=auto/-mthumb/-mfloat-abi=hard.


OK for trunk?

gcc/ChangeLog:

* config/arm/arm.cc (arm_predict_doloop_p): Reject loops with
function calls that are not builtins.


gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/ivopts-3.c: New test.

diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
index 7d67d2cfee9f4edc91f187e940be40c07ff726cd..6dab65f493beb76089f80966a73a46afe037e6f9 100644
--- a/gcc/config/arm/arm.cc
+++ b/gcc/config/arm/arm.cc
@@ -35587,6 +35587,22 @@ arm_predict_doloop_p (struct loop *loop)
" loop bb complexity.\n");
   return false;
 }
+  else
+{
+  gimple_stmt_iterator gsi = gsi_after_labels (loop->header);
+  while (!gsi_end_p (gsi))
+   {
+ if (is_gimple_call (gsi_stmt (gsi))
+ && !gimple_call_builtin_p (gsi_stmt (gsi)))
+   {
+ if (dump_file && (dump_flags & TDF_DETAILS))
+   fprintf (dump_file, "Predict doloop failure due to"
+   " call in loop.\n");
+ return false;
+   }
+ gsi_next (&gsi);
+   }
+}
 
   return true;
 }
diff --git a/gcc/testsuite/gcc.target/arm/mve/ivopts-3.c 
b/gcc/testsuite/gcc.target/arm/mve/ivopts-3.c
new file mode 100644
index 
..19b2442ef12cbf13d51761ae93c7c81bb5bc07c4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/ivopts-3.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ivopts-details" } */
+
+void f2 (void);
+
+int main (void)
+{
+  int i;
+  for (i = 0; i < 10; i++)
+f2 ();
+}
+
+/* { dg-final { scan-tree-dump "Predict doloop failure due to call in loop." 
"ivopts" } } */


[pushed] Add a debug counter for late-combine

2024-06-25 Thread Richard Sandiford
This should help to diagnose problems like PR115631.

Bootstrapped & regression-tested on aarch64-linux-gnu, pushed as obvious.
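
For readers unfamiliar with how a counter helps bisection, here is a toy
model of the gating idea.  It is an assumption-laden sketch, not GCC's
dbgcnt implementation: toy_debug_counter and run_gated are hypothetical
names, and the counter simply allows the first `limit` queries.

```cpp
#include <cassert>

// Hypothetical toy counter: allows the first `limit` queries and
// rejects all later ones.
struct toy_debug_counter
{
  unsigned limit;
  unsigned seen;

  explicit toy_debug_counter (unsigned l) : limit (l), seen (0) {}

  bool allow () { return seen++ < limit; }
};

// Try CANDIDATES transformations, each gated by the counter, and return
// how many were actually performed.
unsigned run_gated (toy_debug_counter &counter, unsigned candidates)
{
  unsigned performed = 0;
  for (unsigned i = 0; i < candidates; i++)
    if (counter.allow ())
      performed++;
  return performed;
}
```

Varying the limit between runs narrows down which individual
transformation first makes a test fail.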

Richard


gcc/
* dbgcnt.def (late_combine): New debug counter.
* late-combine.cc (insn_combination::run): Use it.
---
 gcc/dbgcnt.def  | 1 +
 gcc/late-combine.cc | 6 ++
 2 files changed, 7 insertions(+)

diff --git a/gcc/dbgcnt.def b/gcc/dbgcnt.def
index ed9f062eac2..e0b9b1b2a76 100644
--- a/gcc/dbgcnt.def
+++ b/gcc/dbgcnt.def
@@ -186,6 +186,7 @@ DEBUG_COUNTER (ipa_sra_params)
 DEBUG_COUNTER (ipa_sra_retvalues)
 DEBUG_COUNTER (ira_move)
 DEBUG_COUNTER (ivopts_loop)
+DEBUG_COUNTER (late_combine)
 DEBUG_COUNTER (lim)
 DEBUG_COUNTER (local_alloc_for_sched)
 DEBUG_COUNTER (loop_unswitch)
diff --git a/gcc/late-combine.cc b/gcc/late-combine.cc
index 22a1d81d38e..fc75d1c56d7 100644
--- a/gcc/late-combine.cc
+++ b/gcc/late-combine.cc
@@ -41,6 +41,7 @@
 #include "tree-pass.h"
 #include "cfgcleanup.h"
 #include "target.h"
+#include "dbgcnt.h"
 
 using namespace rtl_ssa;
 
@@ -428,6 +429,11 @@ insn_combination::run ()
   || !crtl->ssa->verify_insn_changes (m_nondebug_changes))
 return false;
 
+  // We've now decided that the optimization is valid and profitable.
+  // Allow it to be suppressed for bisection purposes.
+  if (!dbg_cnt (::late_combine))
+return false;
+
   substitute_optional_uses (m_def);
 
   confirm_change_group ();
-- 
2.25.1



[PATCH]middle-end: Implement conditional store vectorizer pattern [PR115531]

2024-06-25 Thread Tamar Christina
Hi All,

This adds a conditional store optimization for the vectorizer as a pattern.
The vectorizer already supports modifying memory accesses because of the pattern
based gather/scatter recognition.

Doing it in the vectorizer allows us to still keep the ability to vectorize such
loops for architectures that don't have MASK_STORE support, whereas doing this
in ifcvt makes us commit to MASK_STORE.

Concretely for this loop:

void foo1 (char *restrict a, int *restrict b, int *restrict c, int n, int stride)
{
  if (stride <= 1)
    return;

  for (int i = 0; i < n; i++)
    {
      int res = c[i];
      int t = b[i+stride];
      if (a[i] != 0)
        res = t;
      c[i] = res;
    }
}

today we generate:

.L3:
        ld1b    z29.s, p7/z, [x0, x5]
        ld1w    z31.s, p7/z, [x2, x5, lsl 2]
        ld1w    z30.s, p7/z, [x1, x5, lsl 2]
        cmpne   p15.b, p6/z, z29.b, #0
        sel     z30.s, p15, z30.s, z31.s
        st1w    z30.s, p7, [x2, x5, lsl 2]
        add     x5, x5, x4
        whilelo p7.s, w5, w3
        b.any   .L3

which in gimple is:

  vect_res_18.9_68 = .MASK_LOAD (vectp_c.7_65, 32B, loop_mask_67);
  vect_t_20.12_74 = .MASK_LOAD (vectp.10_72, 32B, loop_mask_67);
  vect__9.15_77 = .MASK_LOAD (vectp_a.13_75, 8B, loop_mask_67);
  mask__34.16_79 = vect__9.15_77 != { 0, ... };
  vect_res_11.17_80 = VEC_COND_EXPR ;
  .MASK_STORE (vectp_c.18_81, 32B, loop_mask_67, vect_res_11.17_80);

A MASK_STORE is already conditional, so there's no need to perform the load of
the old values and the VEC_COND_EXPR.  This patch makes it so we generate:

  vect_res_18.9_68 = .MASK_LOAD (vectp_c.7_65, 32B, loop_mask_67);
  vect__9.15_77 = .MASK_LOAD (vectp_a.13_75, 8B, loop_mask_67);
  mask__34.16_79 = vect__9.15_77 != { 0, ... };
  .MASK_STORE (vectp_c.18_81, 32B, mask__34.16_79, vect_res_18.9_68);

which generates:

.L3:
        ld1b    z30.s, p7/z, [x0, x5]
        ld1w    z31.s, p7/z, [x1, x5, lsl 2]
        cmpne   p7.b, p7/z, z30.b, #0
        st1w    z31.s, p7, [x2, x5, lsl 2]
        add     x5, x5, x4
        whilelo p7.s, w5, w3
        b.any   .L3
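
At the scalar level, the equivalence this pattern relies on can be
sketched as follows.  This is only an illustration, not code from the
patch: cond_store_select and cond_store_masked are hypothetical names,
and the second function models the masked store with a plain
conditional store.

```c
/* Original form: load the old value, select, and store
   unconditionally.  */
static void
cond_store_select (const char *a, const int *b, int *c, int n, int stride)
{
  for (int i = 0; i < n; i++)
    {
      int res = c[i];
      int t = b[i + stride];
      if (a[i] != 0)
        res = t;
      c[i] = res;
    }
}

/* Equivalent form: store the new value only where the condition
   holds, so the old value of c[i] never needs to be loaded.  */
static void
cond_store_masked (const char *a, const int *b, int *c, int n, int stride)
{
  for (int i = 0; i < n; i++)
    {
      int t = b[i + stride];
      if (a[i] != 0)
        c[i] = t;
    }
}
```

Both functions leave c in the same state for any input, which is why
the load of the old values and the select are redundant once the store
itself is conditional.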

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

PR tree-optimization/115531
* tree-vect-patterns.cc (vect_cond_store_pattern_same_ref): New.
(vect_recog_cond_store_pattern): New.
(vect_vect_recog_func_ptrs): Use it.

gcc/testsuite/ChangeLog:

PR tree-optimization/115531
* gcc.dg/vect/vect-conditional_store_1.c: New test.
* gcc.dg/vect/vect-conditional_store_2.c: New test.
* gcc.dg/vect/vect-conditional_store_3.c: New test.
* gcc.dg/vect/vect-conditional_store_4.c: New test.

---
diff --git a/gcc/testsuite/gcc.dg/vect/vect-conditional_store_1.c 
b/gcc/testsuite/gcc.dg/vect/vect-conditional_store_1.c
new file mode 100644
index 
..3884a3c3d0a2dc2258097348c75bb7c0b3b37c72
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-conditional_store_1.c
@@ -0,0 +1,24 @@
+/* { dg-do assemble } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_masked_store } */
+
+/* { dg-additional-options "-mavx2" { target avx2 } } */
+/* { dg-additional-options "-march=armv9-a" { target aarch64-*-* } } */
+
+void foo1 (char *restrict a, int *restrict b, int *restrict c, int n, int 
stride)
+{
+  if (stride <= 1)
+return;
+
+  for (int i = 0; i < n; i++)
+{
+  int res = c[i];
+  int t = b[i+stride];
+  if (a[i] != 0)
+res = t;
+  c[i] = res;
+}
+}
+
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
+/* { dg-final { scan-tree-dump-not "VEC_COND_EXPR " "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-conditional_store_2.c 
b/gcc/testsuite/gcc.dg/vect/vect-conditional_store_2.c
new file mode 100644
index 
..bc965a244f147c199b1726e5f6b44229539cd225
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-conditional_store_2.c
@@ -0,0 +1,24 @@
+/* { dg-do assemble } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_masked_store } */
+
+/* { dg-additional-options "-mavx2" { target avx2 } } */
+/* { dg-additional-options "-march=armv9-a" { target aarch64-*-* } } */
+
+void foo2 (char *restrict a, int *restrict b, int *restrict c, int n, int 
stride)
+{
+  if (stride <= 1)
+return;
+
+  for (int i = 0; i < n; i++)
+{
+  int res = c[i];
+  int t = b[i+stride];
+  if (a[i] != 0)
+t = res;
+  c[i] = t;
+}
+}
+
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
+/* { dg-final { scan-tree-dump-not "VEC_COND_EXPR " "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-conditional_store_3.c 
b/gcc/testsuite/gcc.dg/vect/vect-conditional_store_3.c
new file mode 100644
index 
..ab6889f967b330a652917925c2748b16af59b9fd
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-co

Ping^4: [PATCH 0/2] Fix two test failures with --enable-default-pie [PR70150]

2024-06-25 Thread Xi Ruoyao
Ping^4 https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650763.html

On Mon, 2024-05-06 at 12:45 +0800, Xi Ruoyao wrote:
> In GCC 14.1-rc1, there are two new (compared to GCC 13) failures if
> the build is configured --enable-default-pie.  Let's fix them.
> 
> Tested on x86_64-linux-gnu.  Ok for trunk and releases/gcc-14?
> 
> Xi Ruoyao (2):
>   i386: testsuite: Add -no-pie for pr113689-1.c [PR70150]
>   i386: testsuite: Adapt fentryname3.c for r14-811 change [PR70150]
> 
>  gcc/testsuite/gcc.target/i386/fentryname3.c | 3 +--
>  gcc/testsuite/gcc.target/i386/pr113689-1.c  | 2 +-
>  2 files changed, 2 insertions(+), 3 deletions(-)


-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


[PATCH] libstdc++: Replace viewcvs links in docs with cgit links

2024-06-25 Thread Jonathan Wakely
Gerald noticed these stale viewcvs links. I'll push this to trunk later
this week, and backport too.

-- >8 --

libstdc++-v3/ChangeLog:

* doc/xml/faq.xml: Replace viewcvs links with cgit links.
* doc/xml/manual/allocator.xml: Likewise.
* doc/xml/manual/mt_allocator.xml: Likewise.
* doc/html/*: Regenerate.
---
 libstdc++-v3/doc/html/api.html  |  2 +-
 libstdc++-v3/doc/html/faq.html  |  2 +-
 libstdc++-v3/doc/html/manual/memory.html| 10 +-
 libstdc++-v3/doc/html/manual/mt_allocator_impl.html |  6 +++---
 libstdc++-v3/doc/xml/faq.xml|  2 +-
 libstdc++-v3/doc/xml/manual/allocator.xml   | 10 +-
 libstdc++-v3/doc/xml/manual/mt_allocator.xml|  6 +++---
 7 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/libstdc++-v3/doc/html/api.html b/libstdc++-v3/doc/html/api.html
index b494d91f9af..d2cff093d8f 100644
--- a/libstdc++-v3/doc/html/api.html
+++ b/libstdc++-v3/doc/html/api.html
@@ -38,4 +38,4 @@
 
   In addition, a rendered set of man pages are available in the same
   location specified above. Start with C++Intro(3).
-Prev Up Next Home 

+Prev Up Next Home 

\ No newline at end of file
diff --git a/libstdc++-v3/doc/html/faq.html b/libstdc++-v3/doc/html/faq.html
index dcb94ba67dc..507555839f2 100644
--- a/libstdc++-v3/doc/html/faq.html
+++ b/libstdc++-v3/doc/html/faq.html
@@ -147,7 +147,7 @@
  The libstdc++ project is contributed to by several developers
  all over the world, in the same way as GCC or the Linux kernel.
  The current maintainers are listed in the
- https://gcc.gnu.org/viewcvs/gcc/trunk/MAINTAINERS?view=co"; 
target="_top">MAINTAINERS
+ https://gcc.gnu.org/cgit/gcc/tree/MAINTAINERS"; 
target="_top">MAINTAINERS
  file (look for "c++ runtime libs").
 
 Development and discussion is held on the libstdc++ mailing
diff --git a/libstdc++-v3/doc/html/manual/memory.html 
b/libstdc++-v3/doc/html/manual/memory.html
index 08ad2fd4dd8..6b707105969 100644
--- a/libstdc++-v3/doc/html/manual/memory.html
+++ b/libstdc++-v3/doc/html/manual/memory.html
@@ -120,8 +120,8 @@
Over multiple iterations, various STL container
  objects have elements inserted to some maximum amount. A variety
  of allocators are tested.
- Test source for http://gcc.gnu.org/viewcvs/gcc/trunk/libstdc%2B%2B-v3/testsuite/performance/23_containers/insert/sequence.cc?view=markup";
 target="_top">sequence
- and http://gcc.gnu.org/viewcvs/gcc/trunk/libstdc%2B%2B-v3/testsuite/performance/23_containers/insert/associative.cc?view=markup";
 target="_top">associative
+ Test source for https://gcc.gnu.org/cgit/gcc/tree/libstdc++-v3/testsuite/performance/23_containers/insert/sequence.cc";
 target="_top">sequence
+ and https://gcc.gnu.org/cgit/gcc/tree/libstdc++-v3/testsuite/performance/23_containers/insert/associative.cc";
 target="_top">associative
  containers.

Insertion and erasure in a multi-threaded environment.
@@ -130,14 +130,14 @@
  on a per-thread basis, as well as measuring thread contention
  for memory resources.
  Test source
-http://gcc.gnu.org/viewcvs/gcc/trunk/libstdc%2B%2B-v3/testsuite/performance/23_containers/insert_erase/associative.cc?view=markup";
 target="_top">here.
+https://gcc.gnu.org/cgit/gcc/tree/libstdc++-v3/testsuite/performance/23_containers/insert_erase/associative.cc";
 target="_top">here.

 A threaded producer/consumer model.

Test source for
- http://gcc.gnu.org/viewcvs/gcc/trunk/libstdc++-v3/testsuite/performance/23_containers/producer_consumer/sequence.cc?view=markup";
 target="_top">sequence
+ https://gcc.gnu.org/cgit/gcc/tree/libstdc++-v3/testsuite/performance/23_containers/producer_consumer/sequence.cc";
 target="_top">sequence
  and
- http://gcc.gnu.org/viewcvs/gcc/trunk/libstdc++-v3/testsuite/performance/23_containers/producer_consumer/associative.cc?view=markup";
 target="_top">associative
+ https://gcc.gnu.org/cgit/gcc/tree/libstdc++-v3/testsuite/performance/23_containers/producer_consumer/associative.cc";
 target="_top">associative
  containers.
  
  Since GCC 12 the default choice for
diff --git a/libstdc++-v3/doc/html/manual/mt_allocator_impl.html 
b/libstdc++-v3/doc/html/manual/mt_allocator_impl.html
index 2e5926add00..c69b9c5b55a 100644
--- a/libstdc++-v3/doc/html/manual/mt_allocator_impl.html
+++ b/libstdc++-v3/doc/html/manual/mt_allocator_impl.html
@@ -155,7 +155,7 @@ that uses it is fully constructed. For most (but not all) 
STL
 containers, this works, as an instance of the allocator is constructed
 as part of a container's constructor. However, this assumption is
 implementation-specific, and subject to change. For an example of a
-pool that frees memory, see the following
-http://gcc.gnu.org/viewcvs/gcc/trunk/libstdc++-v3/testsuite/ext/mt_allocator/deallocate_local-6.cc?vie

[PATCH v1] RISC-V: Add testcases for vector truncate after .SAT_SUB

2024-06-25 Thread pan2 . li
From: Pan Li 

This patch adds test cases for vector truncation after .SAT_SUB,
i.e.:

  #define DEF_VEC_SAT_U_SUB_TRUNC_FMT_1(OUT_T, IN_T)   \
  void __attribute__((noinline))   \
  vec_sat_u_sub_trunc_##OUT_T##_fmt_1 (OUT_T *out, IN_T *op_1, IN_T y, \
 unsigned limit) \
  {\
unsigned i;\
for (i = 0; i < limit; i++)\
  {\
IN_T x = op_1[i];  \
out[i] = (OUT_T)(x >= y ? x - y : 0);  \
  }\
  }

The following 3 cases are included.

DEF_VEC_SAT_U_SUB_TRUNC_FMT_1(uint8_t, uint16_t)
DEF_VEC_SAT_U_SUB_TRUNC_FMT_1(uint16_t, uint32_t)
DEF_VEC_SAT_U_SUB_TRUNC_FMT_1(uint32_t, uint64_t)
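
As a rough illustration of what one of these instantiations computes,
the sketch below expands the uint8_t/uint16_t case.  The macro is the
one shown above; the sample values in the note after the code are
assumptions chosen for illustration and are not taken from the test
data in the patch.

```c
#include <stdint.h>

/* Saturating unsigned subtract in the wide type IN_T, followed by
   truncation to the narrow type OUT_T.  */
#define DEF_VEC_SAT_U_SUB_TRUNC_FMT_1(OUT_T, IN_T)                   \
void __attribute__((noinline))                                       \
vec_sat_u_sub_trunc_##OUT_T##_fmt_1 (OUT_T *out, IN_T *op_1, IN_T y, \
                                     unsigned limit)                 \
{                                                                    \
  unsigned i;                                                        \
  for (i = 0; i < limit; i++)                                        \
    {                                                                \
      IN_T x = op_1[i];                                              \
      out[i] = (OUT_T)(x >= y ? x - y : 0);                          \
    }                                                                \
}

/* Defines vec_sat_u_sub_trunc_uint8_t_fmt_1.  */
DEF_VEC_SAT_U_SUB_TRUNC_FMT_1(uint8_t, uint16_t)
```

With y = 50, an input of 300 gives 250, an input of 10 saturates to 0,
and an input of 1000 gives 950, which truncates to 182 in uint8_t.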

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h: Add helper
test macros.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_scalar.h: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-1.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-2.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-3.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-run-1.c: New 
test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-run-2.c: New 
test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-run-3.c: New 
test.

Signed-off-by: Pan Li 
---
 .../riscv/rvv/autovec/binop/vec_sat_arith.h   | 19 +
 .../rvv/autovec/binop/vec_sat_binary_scalar.h | 27 +++
 .../rvv/autovec/binop/vec_sat_u_sub_trunc-1.c | 21 ++
 .../rvv/autovec/binop/vec_sat_u_sub_trunc-2.c | 21 ++
 .../rvv/autovec/binop/vec_sat_u_sub_trunc-3.c | 21 ++
 .../autovec/binop/vec_sat_u_sub_trunc-run-1.c | 74 +++
 .../autovec/binop/vec_sat_u_sub_trunc-run-2.c | 74 +++
 .../autovec/binop/vec_sat_u_sub_trunc-run-3.c | 74 +++
 8 files changed, 331 insertions(+)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_scalar.h
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-2.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-3.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-run-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-run-2.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-run-3.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h
index d5c81fbe5a9..a3116033fb3 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h
@@ -310,4 +310,23 @@ vec_sat_u_sub_##T##_fmt_10 (T *out, T *op_1, T *op_2, 
unsigned limit) \
 #define RUN_VEC_SAT_U_SUB_FMT_10(T, out, op_1, op_2, N) \
   vec_sat_u_sub_##T##_fmt_10(out, op_1, op_2, N)
 
+/**/
+/* Saturation Sub Truncated (Unsigned and Signed) 
*/
+/**/
+#define DEF_VEC_SAT_U_SUB_TRUNC_FMT_1(OUT_T, IN_T)   \
+void __attribute__((noinline))   \
+vec_sat_u_sub_trunc_##OUT_T##_fmt_1 (OUT_T *out, IN_T *op_1, IN_T y, \
+unsigned limit) \
+{\
+  unsigned i;\
+  for (i = 0; i < limit; i++)\
+{\
+  IN_T x = op_1[i];  \
+  out[i] = (OUT_T)(x >= y ? x - y : 0);  \
+}\
+}
+
+#define RUN_VEC_SAT_U_SUB_TRUNC_FMT_1(OUT_T, IN_T, out, op_1, y, N) \
+  vec_sat_u_sub_trunc_##OUT_T##_fmt_1(out, op_1, y, N)
+
 #endif
diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_scalar.h 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_scalar.h
new file mode 100644
index 000..c79b180054e
--- /dev/null
+++ b/gcc/testsuite/gcc.

Re: [PATCH] libstdc++: Fix --disable-libstdcxx-verbose abi break [PR115585]

2024-06-25 Thread Jonathan Wakely

Please read https://gcc.gnu.org/contribute.html#patches and ensure
you've included everything, for example ...

On 22/06/24 17:11 -0400, Shengdun Wang wrote:

__glibcxx_assert_fail is not defined when libstdcxx-verbose is
disabled.  This causes an ABI break for binaries that were
compiled with verbose enabled.

libstdc++-v3/ChangeLog:

* src/c++11/assert_fail.cc:


This is missing a description of the change.

The PR number should be in the summary line, and the ChangeLog entry.

Patches for libstdc++ should be CC'd to the libstdc++ list:
https://gcc.gnu.org/lists.html

No DCO sign-off:
https://gcc.gnu.org/contribute.html#legal




---
libstdc++-v3/src/c++11/assert_fail.cc | 13 +
1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/libstdc++-v3/src/c++11/assert_fail.cc 
b/libstdc++-v3/src/c++11/assert_fail.cc
index 6d99c7958f3..930cabd5ee6 100644
--- a/libstdc++-v3/src/c++11/assert_fail.cc
+++ b/libstdc++-v3/src/c++11/assert_fail.cc
@@ -22,23 +22,28 @@
// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
// .

+#include 
+#ifdef _GLIBCXX_VERBOSE_ASSERT
#include  // for std::fprintf, stderr
+#endif
#include // for std::abort

-#ifdef _GLIBCXX_VERBOSE_ASSERT
namespace std
{
  [[__noreturn__]]
  void
-  __glibcxx_assert_fail(const char* file, int line,
-   const char* function, const char* condition) noexcept
+  __glibcxx_assert_fail( [[maybe_unused]] const char* file,
+ [[maybe_unused]] int line,
+ [[maybe_unused]] const char* function,
+ [[maybe_unused]] const char* condition) noexcept
  {
+#ifdef _GLIBCXX_VERBOSE_ASSERT
if (file && function && condition)
  fprintf(stderr, "%s:%d: %s: Assertion '%s' failed.\n",
  file, line, function, condition);
else if (function)
  fprintf(stderr, "%s: Undefined behavior detected.\n", function);
+#endif
abort();
  }
}
-#endif




Re: [PATCH][v2] Support single def-use cycle optimization for SLP reduction vectorization

2024-06-25 Thread Richard Biener
On Tue, 25 Jun 2024, Thomas Schwinge wrote:

> Hi!
> 
> On 2024-06-14T11:08:15+0200, Richard Biener  wrote:
> > We can at least mimic single def-use cycle optimization when doing
> > single-lane SLP reductions and that's required to avoid regressing
> > compared to non-SLP.
> >
> > Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
> >
> > * tree-vect-loop.cc (vectorizable_reduction): Allow
> > single-def-use cycles with SLP.
> > (vect_transform_reduction): Handle SLP single def-use cycles.
> > (vect_transform_cycle_phi): Likewise.
> >
> > * gcc.dg/vect/slp-reduc-12.c: New testcase.
> 
> For GCN target (tested '-march=gfx908' on current sources), I see:
> 
> +PASS: gcc.dg/vect/slp-reduc-12.c (test for excess errors)
> +FAIL: gcc.dg/vect/slp-reduc-12.c scan-tree-dump vect "using single 
> def-use cycle for reduction"
> 
> ..., where we've got (see attached):
> 
> [...]
> [...]/gcc.dg/vect/slp-reduc-12.c:10:21: optimized: loop vectorized using 
> 256 byte vectors
> [...]
> [...]/gcc.dg/vect/slp-reduc-12.c:10:21: note:   Reduce using direct 
> vector reduction.
> [...]/gcc.dg/vect/slp-reduc-12.c:10:21: note:   vectorizing stmts using 
> SLP.
> [...]
> 
> How to address?

The testcase works on the premise that we have VnDF and VmSI with
n != m, but for GCN both are 64.  I'm not sure how to gate the dump
scanning properly, but there must be precedent?

Richard.

> 
> Grüße
>  Thomas
> 
> 
> >  gcc/testsuite/gcc.dg/vect/slp-reduc-12.c | 18 ++
> >  gcc/tree-vect-loop.cc| 45 ++--
> >  2 files changed, 45 insertions(+), 18 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.dg/vect/slp-reduc-12.c
> >
> > diff --git a/gcc/testsuite/gcc.dg/vect/slp-reduc-12.c 
> > b/gcc/testsuite/gcc.dg/vect/slp-reduc-12.c
> > new file mode 100644
> > index 000..625f8097c54
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/vect/slp-reduc-12.c
> > @@ -0,0 +1,18 @@
> > +/* { dg-do compile } */
> > +/* { dg-require-effective-target vect_double } */
> > +/* { dg-require-effective-target vect_int } */
> > +/* { dg-require-effective-target vect_hw_misalign } */
> > +/* { dg-additional-options "-Ofast" } */
> > +
> > +double foo (double *x, int * __restrict a, int n)
> > +{
> > +  double r = 0.;
> > +  for (int i = 0; i < n; ++i)
> > +{
> > +  a[i] = a[i] + i;
> > +  r += x[i];
> > +}
> > +  return r;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump "using single def-use cycle for reduction" 
> > "vect" } } */
> > diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> > index bbd5d261907..d9a2ad69484 100644
> > --- a/gcc/tree-vect-loop.cc
> > +++ b/gcc/tree-vect-loop.cc
> > @@ -8320,7 +8320,11 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
> > participating.  When unrolling we want each unrolled iteration to have 
> > its
> > own reduction accumulator since one of the main goals of unrolling a
> > reduction is to reduce the aggregate loop-carried latency.  */
> > -  if (ncopies > 1
> > +  if ((ncopies > 1
> > +   || (slp_node
> > +  && !REDUC_GROUP_FIRST_ELEMENT (stmt_info)
> > +  && SLP_TREE_LANES (slp_node) == 1
> > +  && vect_get_num_copies (loop_vinfo, vectype_in) > 1))
> >&& (STMT_VINFO_RELEVANT (stmt_info) <= vect_used_only_live)
> >&& reduc_chain_length == 1
> >&& loop_vinfo->suggested_unroll_factor == 1)
> > @@ -8373,6 +8377,10 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
> > single_defuse_cycle = false;
> > }
> >  }
> > +  if (dump_enabled_p () && single_defuse_cycle)
> > +dump_printf_loc (MSG_NOTE, vect_location,
> > +"using single def-use cycle for reduction by reducing "
> > +"multiple vectors to one in the loop body\n");
> >STMT_VINFO_FORCE_SINGLE_CYCLE (reduc_info) = single_defuse_cycle;
> >  
> >/* If the reduction stmt is one of the patterns that have lane
> > @@ -8528,9 +8536,8 @@ vect_transform_reduction (loop_vec_info loop_vinfo,
> >  {
> >tree vectype_out = STMT_VINFO_VECTYPE (stmt_info);
> >class loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
> > -  int i;
> > -  int ncopies;
> > -  int vec_num;
> > +  unsigned ncopies;
> > +  unsigned vec_num;
> >  
> >stmt_vec_info reduc_info = info_for_reduction (loop_vinfo, stmt_info);
> >gcc_assert (reduc_info->is_reduc_info);
> > @@ -8577,7 +8584,6 @@ vect_transform_reduction (loop_vec_info loop_vinfo,
> >auto_vec vec_oprnds0;
> >auto_vec vec_oprnds1;
> >auto_vec vec_oprnds2;
> > -  tree def0;
> >  
> >if (dump_enabled_p ())
> >  dump_printf_loc (MSG_NOTE, vect_location, "transform reduction.\n");
> > @@ -8652,20 +8658,21 @@ vect_transform_reduction (loop_vec_info loop_vinfo,
> >   definition.  */
> >if (single_defuse_cycle)
> >  {
> > -  gcc_assert (!slp_node);
> > -  vect_get_vec_defs_for_operand (loop_vinfo, stmt_info, 1,
> > -op.ops

Re: [PATCH V2] [x86] Optimize a < 0 ? -1 : 0 to (signed)a >> 31.

2024-06-25 Thread Richard Biener
On Mon, Jun 24, 2024 at 1:28 AM liuhongt  wrote:
>
> > I think the check for TYPE_UNSIGNED should be of TREE_TYPE (@0) rather
> > than type here.
>
> Changed
>
> > Or maybe you need `types_match (type, TREE_TYPE (@0))` too.
> And use tree_nop_conversion_p (type, TREE_TYPE (@0)) and add view_convert to 
> rshift.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
>
>
> Try to optimize x < 0 ? -1 : 0 into (signed) x >> 31
> and x < 0 ? 1 : 0 into (unsigned) x >> 31.
>
> Move the optimization done in ix86_expand_int_vcond to match.pd.
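
For a single 32-bit element, the two identities can be sketched in
scalar C as below.  This is only an illustration: the patch operates
on vector integer types in match.pd, the function names are
hypothetical, and the signed variant assumes an arithmetic right shift
of negative values, which ISO C leaves implementation-defined but GCC
documents.

```c
#include <stdint.h>

/* x < 0 ? -1 : 0, written as a select.  */
static int32_t
neg_mask_select (int32_t x)
{
  return x < 0 ? -1 : 0;
}

/* The same value via a signed (arithmetic) right shift, which copies
   the sign bit into every bit position.  */
static int32_t
neg_mask_shift (int32_t x)
{
  return x >> 31;
}

/* x < 0 ? 1 : 0, written as a select.  */
static int32_t
neg_bit_select (int32_t x)
{
  return x < 0 ? 1 : 0;
}

/* The same value via an unsigned (logical) right shift, which moves
   the sign bit down to bit 0.  */
static int32_t
neg_bit_shift (int32_t x)
{
  return (int32_t) ((uint32_t) x >> 31);
}
```

The shift count of 31 corresponds to "prec - 1" in the pattern, where
prec is the element precision.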
>
> gcc/ChangeLog:
>
> PR target/114189
> * match.pd: Simplify a < 0 ? -1 : 0 to (signed) a >> 31 and a <
> 0 ? 1 : 0 to (unsigned) a >> 31 for vector integer type.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/avx2-pr115517.c: New test.
> * gcc.target/i386/avx512-pr115517.c: New test.
> * g++.target/i386/avx2-pr115517.C: New test.
> * g++.target/i386/avx512-pr115517.C: New test.
> * g++.dg/tree-ssa/pr88152-1.C: Adjust testcase.
> ---
>  gcc/match.pd  | 31 
>  gcc/testsuite/g++.dg/tree-ssa/pr88152-1.C |  2 +-
>  gcc/testsuite/g++.target/i386/avx2-pr115517.C | 60 
>  .../g++.target/i386/avx512-pr115517.C | 70 +++
>  gcc/testsuite/gcc.target/i386/avx2-pr115517.c | 33 +
>  .../gcc.target/i386/avx512-pr115517.c | 70 +++
>  6 files changed, 265 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/g++.target/i386/avx2-pr115517.C
>  create mode 100644 gcc/testsuite/g++.target/i386/avx512-pr115517.C
>  create mode 100644 gcc/testsuite/gcc.target/i386/avx2-pr115517.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/avx512-pr115517.c
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 3d0689c9312..1d10451d0de 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -5927,6 +5927,37 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> (if (VECTOR_INTEGER_TYPE_P (type)
> && target_supports_op_p (type, MINMAX, optab_vector))
>  (minmax @0 @1
> +
> +/* Try to optimize x < 0 ? -1 : 0 into (signed) x >> 31
> +   and x < 0 ? 1 : 0 into (unsigned) x >> 31.  */
> +(simplify
> +  (vec_cond (lt @0 integer_zerop) integer_all_onesp integer_zerop)
> +   (if (VECTOR_INTEGER_TYPE_P (TREE_TYPE (@0))
> +   && !TYPE_UNSIGNED (TREE_TYPE (@0))
> +   && tree_nop_conversion_p (type, TREE_TYPE (@0))
> +   && target_supports_op_p (TREE_TYPE (@0), RSHIFT_EXPR, optab_scalar))
> +(with
> +  {
> +   unsigned int prec = element_precision (TREE_TYPE (@0));
> +  }
> +(view_convert:type

:type is unnecessary here (it's auto-deduced)

> +  (rshift @0 { build_int_cst (integer_type_node, prec - 1);})
> +
> +(simplify
> +  (vec_cond (lt @0 integer_zerop) integer_onep integer_zerop)
> +   (if (VECTOR_INTEGER_TYPE_P (TREE_TYPE (@0))
> +   && !TYPE_UNSIGNED (TREE_TYPE (@0))
> +   && tree_nop_conversion_p (type, TREE_TYPE (@0))
> +   && target_supports_op_p (unsigned_type_for (TREE_TYPE (@0)),
> +   RSHIFT_EXPR, optab_scalar))
> +(with
> +  {
> +   unsigned int prec = element_precision (TREE_TYPE (@0));
> +   tree utype = unsigned_type_for (TREE_TYPE (@0));
> +  }

Please put the target_supports_op_p check here to be able to re-use utype.

> +(view_convert:type

see above.  OK with those changes.

Thanks,
Richard.

> +  (rshift (view_convert:utype @0)
> + { build_int_cst (integer_type_node, prec - 1);})
>  #endif
>
>  (for cnd (cond vec_cond)
> diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr88152-1.C 
> b/gcc/testsuite/g++.dg/tree-ssa/pr88152-1.C
> index 423ec897c1d..21299b886f0 100644
> --- a/gcc/testsuite/g++.dg/tree-ssa/pr88152-1.C
> +++ b/gcc/testsuite/g++.dg/tree-ssa/pr88152-1.C
> @@ -1,7 +1,7 @@
>  // PR target/88152
>  // { dg-do compile }
>  // { dg-options "-O2 -std=c++14 -fdump-tree-forwprop1" }
> -// { dg-final { scan-tree-dump-times " (?:<|>=) \{ 0\[, ]" 120 "forwprop1" } 
> }
> +// { dg-final { scan-tree-dump-times " (?:(?:<|>=) \{ 0\[, \]|>> 
> (?:7|15|31|63))" 120 "forwprop1" } }
>
>  template 
>  using V [[gnu::vector_size (sizeof (T) * N)]] = T;
> diff --git a/gcc/testsuite/g++.target/i386/avx2-pr115517.C 
> b/gcc/testsuite/g++.target/i386/avx2-pr115517.C
> new file mode 100644
> index 000..ec000c57542
> --- /dev/null
> +++ b/gcc/testsuite/g++.target/i386/avx2-pr115517.C
> @@ -0,0 +1,60 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mavx2 -O2" } */
> +/* { dg-final { scan-assembler-times "vpsrlq" 2 } } */
> +/* { dg-final { scan-assembler-times "vpsrld" 2 } } */
> +/* { dg-final { scan-assembler-times "vpsrlw" 2 } } */
> +
> +typedef short v8hi __attribute__((vector_size(16)));
> +typedef short v16hi __attribute__((vector_size(32)));
> +typedef int v4si __attribute__((vector_size(16)));
> +typedef int v8si __attribute__((vector_size(32)));
> +typedef long

RE: [PATCH][ivopts]: use affine_tree when comparing IVs during candidate selection [PR114932]

2024-06-25 Thread Richard Biener
On Mon, 24 Jun 2024, Tamar Christina wrote:

> 
> 
> > -Original Message-
> > From: Richard Biener 
> > Sent: Thursday, June 20, 2024 8:49 AM
> > To: Tamar Christina 
> > Cc: gcc-patches@gcc.gnu.org; nd ; bin.ch...@linux.alibaba.com
> > Subject: RE: [PATCH][ivopts]: use affine_tree when comparing IVs during 
> > candidate
> > selection [PR114932]
> > 
> > On Wed, 19 Jun 2024, Tamar Christina wrote:
> > 
> > > > -Original Message-
> > > > From: Richard Biener 
> > > > Sent: Wednesday, June 19, 2024 12:55 PM
> > > > To: Tamar Christina 
> > > > Cc: gcc-patches@gcc.gnu.org; nd ;
> > bin.ch...@linux.alibaba.com
> > > > Subject: Re: [PATCH][ivopts]: use affine_tree when comparing IVs during
> > candidate
> > > > selection [PR114932]
> > > >
> > > > On Fri, 14 Jun 2024, Tamar Christina wrote:
> > > >
> > > > > Hi All,
> > > > >
> > > > > > IVOPTS normally uses affine trees to perform comparisons
> > > > > > between different IVs, but these seem to have been missing in
> > > > > > two key spots, where normal tree equivalences were used instead.
> > > > > >
> > > > > > In some cases where we have a structural equivalence but not a
> > > > > > signedness equivalence, we end up generating both a signed and
> > > > > > an unsigned IV for the same candidate.
> > > > > >
> > > > > > This happens quite a lot with Fortran but can also happen in C,
> > > > > > because this same code is unable to figure out when one
> > > > > > expression is a multiple of another.
> > > > >
> > > > > As an example in the attached testcase we get:
> > > > >
> > > > > Initial set of candidates:
> > > > >   cost: 24 (complexity 3)
> > > > >   reg_cost: 9
> > > > >   cand_cost: 15
> > > > >   cand_group_cost: 0 (complexity 3)
> > > > >   candidates: 1, 6, 8
> > > > >group:0 --> iv_cand:6, cost=(0,1)
> > > > >group:1 --> iv_cand:1, cost=(0,0)
> > > > >group:2 --> iv_cand:8, cost=(0,1)
> > > > >group:3 --> iv_cand:8, cost=(0,1)
> > > > >   invariant variables: 6
> > > > >   invariant expressions: 1, 2
> > > > >
> > > > > :
> > > > > inv_expr 1: stride.3_27 * 4
> > > > > inv_expr 2: (unsigned long) stride.3_27 * 4
> > > > >
> > > > > These end up being used in the same group:
> > > > >
> > > > > Group 1:
> > > > > cand  costcompl.  inv.expr.   inv.vars
> > > > > 1 0   0   NIL;6
> > > > > 2 0   0   NIL;6
> > > > > 3 0   0   NIL;6
> > > > >
> > > > > which ends up with IV opts picking the signed and unsigned IVs:
> > > > >
> > > > > Improved to:
> > > > >   cost: 24 (complexity 3)
> > > > >   reg_cost: 9
> > > > >   cand_cost: 15
> > > > >   cand_group_cost: 0 (complexity 3)
> > > > >   candidates: 1, 6, 8
> > > > >group:0 --> iv_cand:6, cost=(0,1)
> > > > >group:1 --> iv_cand:1, cost=(0,0)
> > > > >group:2 --> iv_cand:8, cost=(0,1)
> > > > >group:3 --> iv_cand:8, cost=(0,1)
> > > > >   invariant variables: 6
> > > > >   invariant expressions: 1, 2
> > > > >
> > > > > and so generates the same IV as both signed and unsigned:
> > > > >
> > > > > ;;   basic block 21, loop depth 3, count 214748368 (estimated 
> > > > > locally, freq
> > > > 58.2545), maybe hot
> > > > > ;;prev block 28, next block 31, flags: (NEW, REACHABLE, VISITED)
> > > > > ;;pred:   28 [always]  count:23622320 (estimated locally, 
> > > > > freq 6.4080)
> > > > (FALLTHRU,EXECUTABLE)
> > > > > ;;25 [always]  count:191126046 (estimated locally, 
> > > > > freq 51.8465)
> > > > (FALLTHRU,DFS_BACK,EXECUTABLE)
> > > > >   # .MEM_66 = PHI <.MEM_34(28), .MEM_22(25)>
> > > > >   # ivtmp.22_41 = PHI <0(28), ivtmp.22_82(25)>
> > > > >   # ivtmp.26_51 = PHI 
> > > > >   # ivtmp.28_90 = PHI 
> > > > >
> > > > > ...
> > > > >
> > > > > ;;   basic block 24, loop depth 3, count 214748366 (estimated 
> > > > > locally, freq
> > > > 58.2545), maybe hot
> > > > > ;;prev block 22, next block 25, flags: (NEW, REACHABLE, VISITED)'
> > > > > ;;pred:   22 [always]  count:95443719 (estimated locally, 
> > > > > freq 25.8909)
> > > > (FALLTHRU)
> > > > ;;21 [33.3% (guessed)]  count:71582790 (estimated 
> > > > locally, freq
> > 19.4182)
> > > > (TRUE_VALUE,EXECUTABLE)
> > > > ;;31 [33.3% (guessed)]  count:47721860 (estimated 
> > > > locally, freq
> > 12.9455)
> > > > (TRUE_VALUE,EXECUTABLE)
> > > > # .MEM_22 = PHI <.MEM_44(22), .MEM_31(21), .MEM_79(31)>
> > > >
> > > > 
> > > >ivtmp.22_82 = ivtmp.22_41 + 1;
> > > > ivtmp.26_72 = ivtmp.26_51 + _80;
> > > > ivtmp.28_98 = ivtmp.28_90 + _39;
> > > > >
> > > > > These two IVs are always used as unsigned, so IV ops generates:
> > > > >
> > > > >   _73 = stride.3_27 * 4;
> > > > >   _80 = (unsigned long) _73;
> > > > >   _54 = (unsigned long) stride.3_27;
> > > > >   _39 = _54 * 4;
> > > > >
> > > > > Which means that in e.g. exchange2 we generate a lot of du

Re: [PATCH] Add param for bb limit to invoke fast_vrp.

2024-06-25 Thread Andrew MacLeod



On 6/24/24 22:35, Andrew Pinski wrote:

On Mon, Jun 24, 2024 at 7:20 PM Andrew MacLeod  wrote:

// Fill ssa-cache R with any outgoing ranges on edge E, using QUERY.
bool gori_on_edge (class ssa_cache &r, edge e, range_query *query =
NULL);

This is what the fast_vrp routines uses.  We can gather all range
restrictions generated from an edge efficiently just once and then
intersect them with a known range as we walk the different paths. We
don't need the gori exports , nor any of the other on-demand bits where
we calculate each export range dynamically.. I suspect it would reduce
the workload and memory impact quite a bit, but I'm not really familiar
with exactly how the threader uses those things.

It'd require some minor tweaking to the lazy_ssa_cache to make the
bitmap of names set accessible. This  would provide similar
functionality to what the gori export () routine provides.  Both
relations and inferred ranges should only need to be calculated once per
block as well and could/should/would be applied the same way if they are
present.   I don't *think* the threader uses any of the def chains, but
Aldy can chip in.
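
A minimal sketch of the idea described above, in plain C++ with
hypothetical types: Range, EdgeRanges and walk_path are illustrative names
only and are not GCC's ssa_cache, irange or gori_on_edge.  It assumes the
restrictions an edge imposes have already been collected once per edge, so
walking a path only has to intersect them with what is already known.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Hypothetical closed integer interval [lo, hi]; not GCC's irange.
struct Range
{
  long lo, hi;
  bool empty () const { return lo > hi; }
};

// Intersection of two intervals; the result is empty when they are disjoint.
static Range intersect (Range a, Range b)
{
  return { std::max (a.lo, b.lo), std::min (a.hi, b.hi) };
}

// Ranges one edge imposes on names, assumed to be gathered once per edge.
using EdgeRanges = std::map<std::string, Range>;

// Walk a path of edges, refining KNOWN with each edge's cached ranges.
// Returns false as soon as some name's range becomes empty, meaning the
// path cannot be taken.
static bool walk_path (const std::vector<EdgeRanges> &path,
                       std::map<std::string, Range> &known)
{
  for (const EdgeRanges &edge : path)
    for (const auto &entry : edge)
      {
        auto it = known.find (entry.first);
        Range r = (it == known.end ()
                   ? entry.second
                   : intersect (it->second, entry.second));
        known[entry.first] = r;
        if (r.empty ())
          return false;
      }
  return true;
}
```

Nothing is recomputed per path in this model: the per-edge maps are built
once and only read afterwards, which is the saving the paragraph above is
hoping for.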

+   warning (OPT_Wdisabled_optimization,
+"Using fast VRP algorithm. %d basic blocks"
+" exceeds %s%d limit",
+n_basic_blocks_for_fn (fun),
+"--param=vrp-block-limit=",
+param_vrp_block_limit);

This should be:
warning (OPT_Wdisabled_optimization, "Using fast VRP algorithm. %d basic blocks"
 " exceeds %<--param=vrp-block-limit=%d%> limit",
n_basic_blocks_for_fn (fun), param_vrp_block_limit);

I had thought it was mentioned that options should be quoted but it is
not mentioned in the coding conventions:
https://gcc.gnu.org/codingconventions.html#Diagnostics

But it is mentioned in
https://inbox.sourceware.org/gcc/2d2bd844-2de4-ecff-7a07-b22350750...@gmail.com/
; This is why you were getting an error as you mentioned on IRC.


I didn't do that because when I did, everything bracketed by the %< %>
in the warning came out in bold text.  Is that the intended effect?



Andrew.



Re: [PATCH 1/3 v4] vect: generate suitable convert insn for int -> int, float -> float and int <-> float.

2024-06-25 Thread Richard Biener
On Tue, 25 Jun 2024, Hu, Lin1 wrote:

> Hi,
> 
> This is the current version. 
> 
> I haven't made any major changes to the original code, I think it will have 
> less impact on your code. And I think the current API is sufficient to 
> support the mode selection you mentioned, if you have any concerns you can 
> mention them. I can tweak it further.
> 
> BRs,
> Lin
> 
> gcc/ChangeLog:
> 
>   PR target/107432
>   * tree-vect-generic.cc
>   (expand_vector_conversion): Support convert for int -> int,
>   float -> float and int <-> float.
>   * tree-vect-stmts.cc (vectorizable_conversion): Wrap the
>   indirect convert part.
>   (supportable_indirect_convert_operation): New function.
>   * tree-vectorizer.h (supportable_indirect_convert_operation):
>   Define the new function.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR target/107432
>   * gcc.target/i386/pr107432-1.c: New test.
>   * gcc.target/i386/pr107432-2.c: Ditto.
>   * gcc.target/i386/pr107432-3.c: Ditto.
>   * gcc.target/i386/pr107432-4.c: Ditto.
>   * gcc.target/i386/pr107432-5.c: Ditto.
>   * gcc.target/i386/pr107432-6.c: Ditto.
>   * gcc.target/i386/pr107432-7.c: Ditto.
> ---
>  gcc/testsuite/gcc.target/i386/pr107432-1.c | 234 +++
>  gcc/testsuite/gcc.target/i386/pr107432-2.c | 105 +
>  gcc/testsuite/gcc.target/i386/pr107432-3.c |  55 +
>  gcc/testsuite/gcc.target/i386/pr107432-4.c |  56 +
>  gcc/testsuite/gcc.target/i386/pr107432-5.c |  72 ++
>  gcc/testsuite/gcc.target/i386/pr107432-6.c | 139 +++
>  gcc/testsuite/gcc.target/i386/pr107432-7.c | 150 
>  gcc/tree-vect-generic.cc   |  34 ++-
>  gcc/tree-vect-stmts.cc | 259 ++---
>  gcc/tree-vectorizer.h  |   4 +
>  10 files changed, 1013 insertions(+), 95 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr107432-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr107432-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr107432-3.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr107432-4.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr107432-5.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr107432-6.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr107432-7.c
> 
> diff --git a/gcc/testsuite/gcc.target/i386/pr107432-1.c 
> b/gcc/testsuite/gcc.target/i386/pr107432-1.c
> new file mode 100644
> index 000..a4f37447eb4
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr107432-1.c
> @@ -0,0 +1,234 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=x86-64 -mavx512bw -mavx512vl -O3" } */
> +/* { dg-final { scan-assembler-times "vpmovqd" 6 } } */
> +/* { dg-final { scan-assembler-times "vpmovqw" 6 } } */
> +/* { dg-final { scan-assembler-times "vpmovqb" 6 } } */
> +/* { dg-final { scan-assembler-times "vpmovdw" 6 { target { ia32 } } } } */
> +/* { dg-final { scan-assembler-times "vpmovdw" 8 { target { ! ia32 } } } } */
> +/* { dg-final { scan-assembler-times "vpmovdb" 6 { target { ia32 } } } } */
> +/* { dg-final { scan-assembler-times "vpmovdb" 8 { target { ! ia32 } } } } */
> +/* { dg-final { scan-assembler-times "vpmovwb" 8 } } */
> +
> +#include 
> +
> +typedef short __v2hi __attribute__ ((__vector_size__ (4)));
> +typedef char __v2qi __attribute__ ((__vector_size__ (2)));
> +typedef char __v4qi __attribute__ ((__vector_size__ (4)));
> +typedef char __v8qi __attribute__ ((__vector_size__ (8)));
> +
> +typedef unsigned short __v2hu __attribute__ ((__vector_size__ (4)));
> +typedef unsigned short __v4hu __attribute__ ((__vector_size__ (8)));
> +typedef unsigned char __v2qu __attribute__ ((__vector_size__ (2)));
> +typedef unsigned char __v4qu __attribute__ ((__vector_size__ (4)));
> +typedef unsigned char __v8qu __attribute__ ((__vector_size__ (8)));
> +typedef unsigned int __v2su __attribute__ ((__vector_size__ (8)));
> +
> +__v2si mm_cvtepi64_epi32_builtin_convertvector(__m128i a)
> +{
> +  return __builtin_convertvector((__v2di)a, __v2si);
> +}
> +
> +__m128i  mm256_cvtepi64_epi32_builtin_convertvector(__m256i a)
> +{
> +  return (__m128i)__builtin_convertvector((__v4di)a, __v4si);
> +}
> +
> +__m256i  mm512_cvtepi64_epi32_builtin_convertvector(__m512i a)
> +{
> +  return (__m256i)__builtin_convertvector((__v8di)a, __v8si);
> +}
> +
> +__v2hi   mm_cvtepi64_epi16_builtin_convertvector(__m128i a)
> +{
> +  return __builtin_convertvector((__v2di)a, __v2hi);
> +}
> +
> +__v4hi   mm256_cvtepi64_epi16_builtin_convertvector(__m256i a)
> +{
> +  return __builtin_convertvector((__v4di)a, __v4hi);
> +}
> +
> +__m128i  mm512_cvtepi64_epi16_builtin_convertvector(__m512i a)
> +{
> +  return (__m128i)__builtin_convertvector((__v8di)a, __v8hi);
> +}
> +
> +__v2qi   mm_cvtepi64_epi8_builtin_convertvector(__m128i a)
> +{
> +  return __builtin_convertvector((__v2di)a, __v2qi);
> +}
> +
> +__v4qi   mm256_cvtepi64_epi8_builtin_convert

[PATCH] tree-optimization/115629 - missed tail merging

2024-06-25 Thread Richard Biener
The following fixes a missed tail-merging observed for the testcase
in PR115629.  The issue is that when deps_ok_for_redirect cannot show that
both blocks would be valid prevailing blocks, it rejects the merge.
The following instead makes sure to record the working block as
prevailing.  Also stmt comparison fails for indirect references
and is not handling memory references thoroughly, failing to unify
array indices and pointers indirected.  The following attempts to
fix this.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

I'll push tomorrow if there are no comments or failures reported from CI.

PR tree-optimization/115629
* tree-ssa-tail-merge.cc (gimple_equal_p): Handle
memory references better.
(deps_ok_for_redirect): Handle the case where not both blocks
are considered a valid prevailing block.

* gcc.dg/tree-ssa/tail-merge-1.c: New testcase.
---
 gcc/testsuite/gcc.dg/tree-ssa/tail-merge-1.c | 14 
 gcc/tree-ssa-tail-merge.cc   | 69 +---
 2 files changed, 75 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/tail-merge-1.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/tail-merge-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/tail-merge-1.c
new file mode 100644
index 000..e5670c33ba3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/tail-merge-1.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dce4" } */
+
+void foo1 (int *restrict a, int *restrict b, int *restrict c,
+  int *restrict d, int *restrict res, int n)
+{
+  for (int i = 0; i < n; i++)
+res[i] = a[i] ? b[i] : (c[i] ? b[i] : d[i]);
+}
+
+/* After tail-merging (run during PRE) we should end up merging the two
+   blocks dereferencing 'b', ending up with two iftmp assigns and the
+   iftmp PHI def.  */
+/* { dg-final { scan-tree-dump-times "iftmp\[^\r\n\]* = " 3 "dce4" } } */
diff --git a/gcc/tree-ssa-tail-merge.cc b/gcc/tree-ssa-tail-merge.cc
index c8b4a79294d..27e7c6a37b2 100644
--- a/gcc/tree-ssa-tail-merge.cc
+++ b/gcc/tree-ssa-tail-merge.cc
@@ -1188,7 +1188,52 @@ gimple_equal_p (same_succ *same_succ, gimple *s1, gimple 
*s2)
{
  t1 = gimple_arg (s1, i);
  t2 = gimple_arg (s2, i);
- if (!gimple_operand_equal_value_p (t1, t2))
+ while (handled_component_p (t1) && handled_component_p (t2))
+   {
+ if (TREE_CODE (t1) != TREE_CODE (t2)
+ || TREE_THIS_VOLATILE (t1) != TREE_THIS_VOLATILE (t2))
+   return false;
+ switch (TREE_CODE (t1))
+   {
+   case COMPONENT_REF:
+ if (TREE_OPERAND (t1, 1) != TREE_OPERAND (t2, 1)
+ || !gimple_operand_equal_value_p (TREE_OPERAND (t1, 2),
+   TREE_OPERAND (t2, 2)))
+   return false;
+ break;
+   case ARRAY_REF:
+   case ARRAY_RANGE_REF:
+ if (!gimple_operand_equal_value_p (TREE_OPERAND (t1, 3),
+TREE_OPERAND (t2, 3)))
+   return false;
+ /* Fallthru.  */
+   case BIT_FIELD_REF:
+ if (!gimple_operand_equal_value_p (TREE_OPERAND (t1, 1),
+TREE_OPERAND (t2, 1))
+ || !gimple_operand_equal_value_p (TREE_OPERAND (t1, 2),
+   TREE_OPERAND (t2, 2)))
+   return false;
+ break;
+   case REALPART_EXPR:
+   case IMAGPART_EXPR:
+   case VIEW_CONVERT_EXPR:
+ break;
+   default:
+   gcc_unreachable ();
+   }
+ t1 = TREE_OPERAND (t1, 0);
+ t2 = TREE_OPERAND (t2, 0);
+   }
+ if (TREE_CODE (t1) == MEM_REF && TREE_CODE (t2) == MEM_REF)
+   {
+ if (TREE_THIS_VOLATILE (t1) != TREE_THIS_VOLATILE (t2)
+ || TYPE_ALIGN (TREE_TYPE (t1)) != TYPE_ALIGN (TREE_TYPE (t2))
+ || !gimple_operand_equal_value_p (TREE_OPERAND (t1, 0),
+   TREE_OPERAND (t2, 0))
+ || TREE_OPERAND (t1, 1) != TREE_OPERAND (t2, 1))
+   return false;
+   }
+ else if (!gimple_operand_equal_value_p (t1, t2))
return false;
}
   return true;
@@ -1462,16 +1507,24 @@ deps_ok_for_redirect_from_bb_to_bb (basic_block from, 
basic_block to)
replacement are dominates by their defs.  */
 
 static bool
-deps_ok_for_redirect (basic_block bb1, basic_block bb2)
+deps_ok_for_redirect (basic_block &bb1, basic_block &bb2)
 {
-  if (BB_CLUSTER (bb1) != NULL)
-bb1 = BB_CLUSTER (bb1)->rep_bb;
+  basic_block b1 = bb1;
+  basic_block b2 = bb2;
+  if (BB_CLUSTER (b1) != NULL)
+b1 = BB_CLUSTER (b1)->rep_bb;
 
-  i

[PATCH] GORI cleanups

2024-06-25 Thread Richard Biener
The following replaces conditional is_export_p calls as is_export_p
handles a NULL bb itself.

Bootstrap running on x86_64-unknown-linux-gnu, OK?

Thanks,
Richard.

* gimple-range-gori.cc (gori_compute::may_recompute_p):
Call is_export_p with NULL bb.
---
 gcc/gimple-range-gori.cc | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/gcc/gimple-range-gori.cc b/gcc/gimple-range-gori.cc
index 275283a424f..a31e3be65f7 100644
--- a/gcc/gimple-range-gori.cc
+++ b/gcc/gimple-range-gori.cc
@@ -1332,18 +1332,14 @@ gori_compute::may_recompute_p (tree name, basic_block 
bb, int depth)
  gcc_checking_assert (depth >= 1);
}
 
-  bool res = (bb ? m_map.is_export_p (dep1, bb)
-: m_map.is_export_p (dep1));
+  bool res = m_map.is_export_p (dep1, bb);
   if (res || depth <= 1)
return res;
   // Check another level of recomputation.
   return may_recompute_p (dep1, bb, --depth);
 }
   // Two dependencies terminate the depth of the search.
-  if (bb)
-return m_map.is_export_p (dep1, bb) || m_map.is_export_p (dep2, bb);
-  else
-return m_map.is_export_p (dep1) || m_map.is_export_p (dep2);
+  return m_map.is_export_p (dep1, bb) || m_map.is_export_p (dep2, bb);
 }
 
 // Return TRUE if NAME can be recomputed on edge E.  If any direct dependent
-- 
2.35.3
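
A minimal sketch of why the conditional call sites are redundant, assuming
a callee that treats "no block" as "any block".  export_map, no_block,
query_before and query_after are hypothetical stand-ins, not GCC's gori_map
interface; names and blocks are modelled as plain ints.

```cpp
#include <set>
#include <utility>

// Hypothetical stand-in for an export map.  A block index of no_block
// plays the role of a NULL basic block.
class export_map
{
public:
  static constexpr int no_block = -1;

  void add_export (int name, int bb)
  {
    m_exports.insert (std::make_pair (name, bb));
    m_exported_anywhere.insert (name);
  }

  // True if NAME is exported from BB, or from any block when BB is
  // no_block.  The "no block" case is handled here, so callers need not
  // test for it themselves.
  bool is_export_p (int name, int bb = no_block) const
  {
    if (bb == no_block)
      return m_exported_anywhere.count (name) != 0;
    return m_exports.count (std::make_pair (name, bb)) != 0;
  }

private:
  std::set<std::pair<int, int> > m_exports;
  std::set<int> m_exported_anywhere;
};

// Shape of the call sites before the cleanup: dispatch on BB by hand.
static bool query_before (const export_map &m, int name, int bb)
{
  return bb != export_map::no_block ? m.is_export_p (name, bb)
                                    : m.is_export_p (name);
}

// Shape after the cleanup: pass BB through unconditionally.
static bool query_after (const export_map &m, int name, int bb)
{
  return m.is_export_p (name, bb);
}
```

Under that assumption both helpers return the same answer for every input,
so the shorter form loses nothing.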


Pushed: [PATCH] doc: gccint: Fix typos in jump_table_data description

2024-06-25 Thread Xi Ruoyao
gcc/ChangeLog:

* doc/rtl.texi (jump_table_data): Fix typos.
---

Pushed as obvious.

 gcc/doc/rtl.texi | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi
index c1717ab5f6b..a1ede418c21 100644
--- a/gcc/doc/rtl.texi
+++ b/gcc/doc/rtl.texi
@@ -3872,8 +3872,8 @@ To set the kind of a label, use the @code{SET_LABEL_KIND} 
macro.
 @item jump_table_data
 A @code{jump_table_data} insn is a placeholder for the jump-table data
 of a @code{casesi} or @code{tablejump} insn.  They are placed after
-a @code{tablejump_p} insn.  A @code{jump_table_data} insn is not part o
-a basic blockm but it is associated with the basic block that ends with
+a @code{tablejump_p} insn.  A @code{jump_table_data} insn is not part of
+a basic block but it is associated with the basic block that ends with
 the @code{tablejump_p} insn.  The @code{PATTERN} of a @code{jump_table_data}
 is always either an @code{addr_vec} or an @code{addr_diff_vec}, and a
 @code{jump_table_data} insn is always preceded by a @code{code_label}.
-- 
2.45.2



Re: [PATCH] GORI cleanups

2024-06-25 Thread Andrew MacLeod



On 6/25/24 09:44, Richard Biener wrote:

The following replaces conditional is_export_p calls as is_export_p
handles a NULL bb itself.

Bootstrap running on x86_64-unknown-linux-gnu, OK?


Absolutely.

Thanks
Andrew



Thanks,
Richard.

* gimple-range-gori.cc (gori_compute::may_recompute_p):
Call is_export_p with NULL bb.
---
  gcc/gimple-range-gori.cc | 8 ++--
  1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/gcc/gimple-range-gori.cc b/gcc/gimple-range-gori.cc
index 275283a424f..a31e3be65f7 100644
--- a/gcc/gimple-range-gori.cc
+++ b/gcc/gimple-range-gori.cc
@@ -1332,18 +1332,14 @@ gori_compute::may_recompute_p (tree name, basic_block 
bb, int depth)
  gcc_checking_assert (depth >= 1);
}
  
-  bool res = (bb ? m_map.is_export_p (dep1, bb)
-: m_map.is_export_p (dep1));
+  bool res = m_map.is_export_p (dep1, bb);
if (res || depth <= 1)
return res;
// Check another level of recomputation.
return may_recompute_p (dep1, bb, --depth);
  }
// Two dependencies terminate the depth of the search.
-  if (bb)
-return m_map.is_export_p (dep1, bb) || m_map.is_export_p (dep2, bb);
-  else
-return m_map.is_export_p (dep1) || m_map.is_export_p (dep2);
+  return m_map.is_export_p (dep1, bb) || m_map.is_export_p (dep2, bb);
  }
  
  // Return TRUE if NAME can be recomputed on edge E.  If any direct dependent




Re: [PATCH] Hard register asm constraint

2024-06-25 Thread Paul Koning



> On Jun 24, 2024, at 1:50 AM, Stefan Schulze Frielinghaus 
>  wrote:
> 
> Ping.
> 
> On Mon, Jun 10, 2024 at 07:19:19AM +0200, Stefan Schulze Frielinghaus wrote:
>> Ping.
>> 
>> On Fri, May 24, 2024 at 11:13:12AM +0200, Stefan Schulze Frielinghaus wrote:
>>> This implements hard register constraints for inline asm.  A hard register
>>> constraint is of the form {regname} where regname is any valid register.  
>>> This
>>> basically renders register asm superfluous.  For example, the snippet
>>> 
>>> int test (int x, int y)
>>> {
>>>  register int r4 asm ("r4") = x;
>>>  register int r5 asm ("r5") = y;
>>>  unsigned int copy = y;
>>>  asm ("foo %0,%1,%2" : "+d" (r4) : "d" (r5), "d" (copy));
>>>  return r4;
>>> }
>>> 
>>> could be rewritten into
>>> 
>>> int test (int x, int y)
>>> {
>>>  asm ("foo %0,%1,%2" : "+{r4}" (x) : "{r5}" (y), "d" (y));
>>>  return x;
>>> }

I like this idea but I'm wondering: regular constraints specify what sort of 
value is needed, for example an int vs. a short int vs. a float.  The notation 
you've shown doesn't seem to have that aspect.

The other comment is that I didn't see documentation updates to reflect this 
new feature.

paul



[PATCH] late-combine: Honor targetm.cannot_copy_insn_p

2024-06-25 Thread Richard Sandiford
late-combine was failing to take targetm.cannot_copy_insn_p into
account, which led to multiple definitions of PIC symbols on
arm*-*-* targets.

Currently bootstrapping & regression testing on arm-linux-gnueabihf
and aarch64-linux-gnu.  It should fix the bootstrap-lto problem
reported by Linaro's CI.  OK to install if testing passes?

Richard


gcc/
* late-combine.cc (insn_combination::substitute_nondebug_use):
Reject second and subsequent uses if targetm.cannot_copy_insn_p
disallows copying.
---
 gcc/late-combine.cc | 12 
 1 file changed, 12 insertions(+)

diff --git a/gcc/late-combine.cc b/gcc/late-combine.cc
index fc75d1c56d7..b7c0bc07a8b 100644
--- a/gcc/late-combine.cc
+++ b/gcc/late-combine.cc
@@ -179,6 +179,18 @@ insn_combination::substitute_nondebug_use (use_info *use)
   if (dump_file && (dump_flags & TDF_DETAILS))
 dump_insn_slim (dump_file, use->insn ()->rtl ());
 
+  // Reject second and subsequent uses if the target does not allow
+  // the defining instruction to be copied.
+  if (targetm.cannot_copy_insn_p
+  && m_nondebug_changes.length () >= 2
+  && targetm.cannot_copy_insn_p (m_def_insn->rtl ()))
+{
+  if (dump_file && (dump_flags & TDF_DETAILS))
+   fprintf (dump_file, "-- The target does not allow multiple"
+" copies of insn %d\n", m_def_insn->uid ());
+  return false;
+}
+
   // Check that we can change the instruction pattern.  Leave recognition
   // of the result till later.
   insn_propagation prop (use_rtl, m_dest, m_src);
-- 
2.25.1
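
A minimal sketch of the rule the patch enforces, using a hypothetical
model rather than the pass's real data structures.  The class combination
and its members are illustrative names; the sketch counts substituted uses
directly instead of mirroring the m_nondebug_changes.length () >= 2 test in
the patch, and a plain bool stands in for the target hook's answer.

```cpp
#include <cstddef>

// Hypothetical model of folding one definition into each of its uses.
class combination
{
public:
  // DEF_CANNOT_BE_COPIED stands in for the answer the target hook gives
  // for the defining instruction.
  explicit combination (bool def_cannot_be_copied)
    : m_def_cannot_be_copied (def_cannot_be_copied), m_substituted (0) {}

  // Try to fold the definition into one more use.  The first fold only
  // moves the definition; every later fold needs another copy of it, so
  // it is rejected when copying is not allowed.
  bool substitute_use ()
  {
    if (m_substituted >= 1 && m_def_cannot_be_copied)
      return false;
    ++m_substituted;
    return true;
  }

  std::size_t substituted () const { return m_substituted; }

private:
  bool m_def_cannot_be_copied;
  std::size_t m_substituted;
};
```

A definition with a single use is unaffected by the check in this model;
only the second and later uses are refused.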



[PATCH] tree-optimization/115646 - ICE with pow shrink-wrapping from bitfield

2024-06-25 Thread Richard Biener
The following makes analysis and transform agree on constraints.

Bootstrap and regtest pending on x86_64-unknown-linux-gnu.

PR tree-optimization/115646
* tree-call-cdce.cc (check_pow): Check for bit_sz values
as allowed by transform.

* gcc.dg/pr115646.c: New testcase.
---
 gcc/testsuite/gcc.dg/pr115646.c | 13 +
 gcc/tree-call-cdce.cc   |  2 +-
 2 files changed, 14 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr115646.c

diff --git a/gcc/testsuite/gcc.dg/pr115646.c b/gcc/testsuite/gcc.dg/pr115646.c
new file mode 100644
index 000..24bc1e4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr115646.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+extern double pow(double x, double y);
+
+struct S {
+unsigned int a : 3, b : 8, c : 21;
+};
+
+void foo (struct S *p)
+{
+  pow (p->c, 42);
+}
diff --git a/gcc/tree-call-cdce.cc b/gcc/tree-call-cdce.cc
index 7f67a0b2dc6..befe6acf178 100644
--- a/gcc/tree-call-cdce.cc
+++ b/gcc/tree-call-cdce.cc
@@ -260,7 +260,7 @@ check_pow (gcall *pow_call)
   /* If the type of the base is too wide,
  the resulting shrink wrapping condition
 will be too conservative.  */
-  if (bit_sz > MAX_BASE_INT_BIT_SIZE)
+  if (bit_sz != 8 && bit_sz != 16 && bit_sz != MAX_BASE_INT_BIT_SIZE)
 return false;
 
   return true;
-- 
2.35.3
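
A small sketch of the mismatch the patch closes, written as standalone
C++ rather than GCC code.  The function names are hypothetical, and the
value 32 for the maximum width is an assumption made for this sketch; the
two predicates copy the old and new conditions from the hunk above.

```cpp
// Assumption for this sketch: the maximum base width is 32 bits.
constexpr int max_base_int_bit_size = 32;

// Analysis check before the patch: any width up to the maximum passes.
constexpr bool analysis_accepts_old (int bit_sz)
{
  return !(bit_sz > max_base_int_bit_size);
}

// Analysis check after the patch: only the widths named in the new
// condition pass.
constexpr bool analysis_accepts_new (int bit_sz)
{
  return !(bit_sz != 8 && bit_sz != 16 && bit_sz != max_base_int_bit_size);
}

// The testcase reads a 21-bit bit-field, so the base has 21 bits.
static_assert (analysis_accepts_old (21), "old check lets 21 bits through");
static_assert (!analysis_accepts_new (21), "new check rejects 21 bits");
```

With the old condition a 21-bit base passed analysis even though the
commit message says the transform allows fewer widths; with the new one
the two agree.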


Re: [PATCH] late-combine: Honor targetm.cannot_copy_insn_p

2024-06-25 Thread Jeff Law




On 6/25/24 8:07 AM, Richard Sandiford wrote:

late-combine was failing to take targetm.cannot_copy_insn_p into
account, which led to multiple definitions of PIC symbols on
arm*-*-* targets.

Currently bootstrapping & regression testing on arm-linux-gnueabihf
and aarch64-linux-gnu.  It should fix the bootstrap-lto problem
reported by Linaro's CI.  OK to install if testing passes?

Richard


gcc/
* late-combine.cc (insn_combination::substitute_nondebug_use):
Reject second and subsequent uses if targetm.cannot_copy_insn_p
disallows copying.
---
  gcc/late-combine.cc | 12 
  1 file changed, 12 insertions(+)

diff --git a/gcc/late-combine.cc b/gcc/late-combine.cc
index fc75d1c56d7..b7c0bc07a8b 100644
--- a/gcc/late-combine.cc
+++ b/gcc/late-combine.cc
@@ -179,6 +179,18 @@ insn_combination::substitute_nondebug_use (use_info *use)
if (dump_file && (dump_flags & TDF_DETAILS))
  dump_insn_slim (dump_file, use->insn ()->rtl ());
  
+  // Reject second and subsequent uses if the target does not allow
+  // the defining instruction to be copied.
+  if (targetm.cannot_copy_insn_p
+  && m_nondebug_changes.length () >= 2
+  && targetm.cannot_copy_insn_p (m_def_insn->rtl ()))
+{
+  if (dump_file && (dump_flags & TDF_DETAILS))
+   fprintf (dump_file, "-- The target does not allow multiple"
+" copies of insn %d\n", m_def_insn->uid ());
+  return false;
+}

OK
jeff



[PATCH, obvious] Fix PR c/115587, uninitialized variable in c_parser_omp_loop_nest

2024-06-25 Thread Sandra Loosemore
This function had a reference to an uninitialized variable on the
error path.  The problem was diagnosed by clang but not gcc.  It seems
the cleanest solution is to initialize all the loop-clause variables
at the point of declaration rather than at different places in the
code.

The C++ front end didn't have this problem, but I've made similar
changes there to keep the code in sync.

gcc/c/ChangeLog:

PR c/115587
* c-parser.cc (c_parser_omp_loop_nest): Move initializations to
point of declaration.

gcc/cp/ChangeLog:

PR c/115587
* parser.cc (cp_parser_omp_loop_nest): Move initializations to
point of declaration.
---
 gcc/c/c-parser.cc | 4 +---
 gcc/cp/parser.cc  | 8 ++--
 2 files changed, 3 insertions(+), 9 deletions(-)

diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index e83e9c683f7..33643ec910a 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -22430,7 +22430,7 @@ static tree c_parser_omp_unroll (location_t, c_parser 
*, bool *);
 static tree
 c_parser_omp_loop_nest (c_parser *parser, bool *if_p)
 {
-  tree decl, cond, incr, init;
+  tree decl = NULL_TREE, cond = NULL_TREE, incr = NULL_TREE, init = NULL_TREE;
   tree body = NULL_TREE;
   matching_parens parens;
   bool moreloops;
@@ -22619,7 +22619,6 @@ c_parser_omp_loop_nest (c_parser *parser, bool *if_p)
 }
 
   /* Parse the loop condition.  */
-  cond = NULL_TREE;
   if (c_parser_next_token_is_not (parser, CPP_SEMICOLON))
 {
   location_t cond_loc = c_parser_peek_token (parser)->location;
@@ -22652,7 +22651,6 @@ c_parser_omp_loop_nest (c_parser *parser, bool *if_p)
   c_parser_skip_until_found (parser, CPP_SEMICOLON, "expected %<;%>");
 
   /* Parse the increment expression.  */
-  incr = NULL_TREE;
   if (c_parser_next_token_is_not (parser, CPP_CLOSE_PAREN))
 {
   location_t incr_loc = c_parser_peek_token (parser)->location;
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index e7409b856f1..e5f16fe963d 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -45153,8 +45153,8 @@ static tree cp_parser_omp_tile (cp_parser *, cp_token 
*, bool *);
 static tree
 cp_parser_omp_loop_nest (cp_parser *parser, bool *if_p)
 {
-  tree decl, cond, incr, init;
-  tree orig_init, real_decl, orig_decl;
+  tree decl = NULL_TREE, cond = NULL_TREE, incr = NULL_TREE, init = NULL_TREE;
+  tree orig_init = NULL_TREE, real_decl = NULL_TREE, orig_decl = NULL_TREE;
   tree init_block, body_block;
   tree init_placeholder, body_placeholder;
   tree init_scope;
@@ -45324,8 +45324,6 @@ cp_parser_omp_loop_nest (cp_parser *parser, bool *if_p)
   if (!parens.require_open (parser))
 return NULL;
 
-  init = orig_init = decl = real_decl = orig_decl = NULL_TREE;
-
   init_placeholder = build_stmt (input_location, EXPR_STMT,
 integer_zero_node);
   vec_safe_push (omp_for_parse_state->init_placeholderv, init_placeholder);
@@ -45501,12 +45499,10 @@ cp_parser_omp_loop_nest (cp_parser *parser, bool 
*if_p)
}
 }
 
-  cond = NULL;
   if (cp_lexer_next_token_is_not (parser->lexer, CPP_SEMICOLON))
 cond = cp_parser_omp_for_cond (parser, decl, omp_for_parse_state->code);
   cp_parser_require (parser, CPP_SEMICOLON, RT_SEMICOLON);
 
-  incr = NULL;
   if (cp_lexer_next_token_is_not (parser->lexer, CPP_CLOSE_PAREN))
 {
   /* If decl is an iterator, preserve the operator on decl
-- 
2.25.1
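
A minimal sketch of the pattern the patch adopts, assuming a toy parser
in place of the real one.  LoopParts and parse_loop_header are
hypothetical names, and const char * with nullptr stands in for tree with
NULL_TREE.

```cpp
// The three pieces of a loop header; nullptr means "not parsed".
struct LoopParts
{
  const char *init;
  const char *cond;
  const char *incr;
};

// Hypothetical parser.  A null INIT_TEXT or COND_TEXT models a parse
// error at that point.
static LoopParts parse_loop_header (const char *init_text,
                                    const char *cond_text,
                                    const char *incr_text)
{
  // Initialized where declared, so every path below reads defined values.
  const char *init = nullptr, *cond = nullptr, *incr = nullptr;

  do
    {
      if (!init_text)
        break;          // Error path: cond and incr are never assigned.
      init = init_text;
      if (!cond_text)
        break;          // Error path: incr is never assigned.
      cond = cond_text;
      incr = incr_text;
    }
  while (false);

  // Shared exit, reached from both the normal and the error paths.
  return { init, cond, incr };
}
```

Had the locals been assigned only at their respective parsing steps, the
shared exit would read indeterminate values on the early-out paths, which
is the kind of problem the commit message describes.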



Re: [PATCH] Hard register asm constraint

2024-06-25 Thread Maciej W. Rozycki
On Tue, 25 Jun 2024, Paul Koning wrote:

> >>> could be rewritten into
> >>> 
> >>> int test (int x, int y)
> >>> {
> >>>  asm ("foo %0,%1,%2" : "+{r4}" (x) : "{r5}" (y), "d" (y));
> >>>  return x;
> >>> }
> 
> I like this idea but I'm wondering: regular constraints specify what 
> sort of value is needed, for example an int vs. a short int vs. a float.  

 Isn't that inferred from the data type of the associated expression used, 
the types of `x' and `y' in this case?  Then the constraints only tell the 
middle end where that data comes from or goes to at the boundaries of an 
asm.

  Maciej


Re: [PATCH] c++: decltype of by-ref capture proxy of ref [PR115504]

2024-06-25 Thread Patrick Palka
On Mon, 24 Jun 2024, Jason Merrill wrote:

> On 6/24/24 21:00, Patrick Palka wrote:
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK
> > for trunk/14?
> > 
> > -- >8 --
> > 
> > The capture proxy handling in finish_decltype_type added in r14-5330
> > was stripping the reference type of a capture proxy's captured variable,
> > which is desirable for a by-value capture, but not for a by-ref capture
> > (of a reference).
> 
> I'm not sure why we would want it for by-value, either; regardless of the
> capture kind, decltype(x) is int&.

Ah, makes sense.  But I guess that means

  void f(int& x) {
[x]() {
  decltype(auto) a = x;
}
  }

is ill-formed since decltype(x) is int& but the corresponding closure
member is const?  It works if we make the lambda mutable.

Like so?  Bootstrapped and regtested on x86_64-pc-linux-gnu.


-- >8 --

Subject: [PATCH] c++: decltype of capture proxy of ref [PR115504]

The capture proxy handling in finish_decltype_type added in r14-5330
was stripping the reference type of a capture proxy's captured variable.

PR c++/115504

gcc/cp/ChangeLog:

* semantics.cc (finish_decltype_type): Don't strip the reference
type (if any) of a capture proxy's captured variable.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/decltype-auto8.C: New test.
---
 gcc/cp/semantics.cc |  1 -
 gcc/testsuite/g++.dg/cpp1y/decltype-auto8.C | 18 ++
 2 files changed, 18 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1y/decltype-auto8.C

diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 6c1813d37c6..6a383c0f7f9 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12071,7 +12071,6 @@ finish_decltype_type (tree expr, bool 
id_expression_or_member_access_p,
{
  expr = DECL_CAPTURED_VARIABLE (expr);
  type = TREE_TYPE (expr);
- type = non_reference (type);
}
  else
{
diff --git a/gcc/testsuite/g++.dg/cpp1y/decltype-auto8.C 
b/gcc/testsuite/g++.dg/cpp1y/decltype-auto8.C
new file mode 100644
index 000..1100b94a5b7
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/decltype-auto8.C
@@ -0,0 +1,18 @@
+// PR c++/115504
+// { dg-do compile { target c++14 } }
+
+void f(int& x) {
+  [&x]() {
+decltype(auto) a = x;
+using type = decltype(x);
+using type = decltype(a);
+using type = int&; // not 'int'
+  };
+
+  [x]() mutable {
+decltype(auto) a = x;
+using type = decltype(x);
+using type = decltype(a);
+using type = int&;
+  };
+}
-- 
2.45.2.648.g1e1586e4ed
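
A short sketch of the closure-member constness that the question about
the mutable lambda turns on.  It deliberately captures a plain int rather
than a reference, so it does not exercise the code path this patch
changes; closure_member_constness_demo is a hypothetical name.

```cpp
#include <type_traits>

// For a by-value capture, decltype (x) names the declared type of the
// original variable, while decltype ((x)) reflects the closure member,
// which is const unless the lambda is mutable.
inline bool closure_member_constness_demo ()
{
  int x = 0;

  auto by_value = [x]() {
    static_assert (std::is_same<decltype (x), int>::value,
                   "declared type of x");
    static_assert (std::is_same<decltype ((x)), const int &>::value,
                   "member is const in a non-mutable lambda");
    return x;
  };

  auto by_value_mutable = [x]() mutable {
    static_assert (std::is_same<decltype ((x)), int &>::value,
                   "member is non-const in a mutable lambda");
    return ++x;   // Modifies the closure's copy, not the outer x.
  };

  return by_value () == 0 && by_value_mutable () == 1 && x == 0;
}
```

This is why binding an int& to the captured copy is only possible once
the lambda is declared mutable.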

> 
> > PR c++/115504
> > 
> > gcc/cp/ChangeLog:
> > 
> > * semantics.cc (finish_decltype_type): For a by-reference
> > capture proxy, don't strip the reference type (if any) of
> > the captured variable.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/cpp1y/decltype-auto8.C: New test.
> > ---
> >   gcc/cp/semantics.cc |  4 +++-
> >   gcc/testsuite/g++.dg/cpp1y/decltype-auto8.C | 11 +++
> >   2 files changed, 14 insertions(+), 1 deletion(-)
> >   create mode 100644 gcc/testsuite/g++.dg/cpp1y/decltype-auto8.C
> > 
> > diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
> > index 08f5f245e7d..b4f626924af 100644
> > --- a/gcc/cp/semantics.cc
> > +++ b/gcc/cp/semantics.cc
> > @@ -12076,9 +12076,11 @@ finish_decltype_type (tree expr, bool
> > id_expression_or_member_access_p,
> > {
> >   if (is_normal_capture_proxy (expr))
> > {
> > + bool by_ref = TYPE_REF_P (TREE_TYPE (expr));
> >   expr = DECL_CAPTURED_VARIABLE (expr);
> >   type = TREE_TYPE (expr);
> > - type = non_reference (type);
> > + if (!by_ref)
> > +   type = non_reference (type);
> > }
> >   else
> > {
> > diff --git a/gcc/testsuite/g++.dg/cpp1y/decltype-auto8.C
> > b/gcc/testsuite/g++.dg/cpp1y/decltype-auto8.C
> > new file mode 100644
> > index 000..9a5e435f14f
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/cpp1y/decltype-auto8.C
> > @@ -0,0 +1,11 @@
> > +// PR c++/115504
> > +// { dg-do compile { target c++14 } }
> > +
> > +void f(int& x) {
> > +  [&x]() {
> > +decltype(auto) a = x;
> > +using type = decltype(x);
> > +using type = decltype(a);
> > +using type = int&; // not 'int'
> > +  };
> > +}
> 
> 



Re: [PATCH] c++: alias CTAD and copy deduction guide [PR115198]

2024-06-25 Thread Patrick Palka
On Thu, 13 Jun 2024, Patrick Palka wrote:

> On Thu, 13 Jun 2024, Jason Merrill wrote:
> 
> > On 6/13/24 11:05, Patrick Palka wrote:
> > > On Thu, 23 May 2024, Jason Merrill wrote:
> > > 
> > > > On 5/23/24 17:42, Patrick Palka wrote:
> > > > > On Thu, 23 May 2024, Jason Merrill wrote:
> > > > > 
> > > > > > On 5/23/24 14:06, Patrick Palka wrote:
> > > > > > > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look
> > > > > > > OK for trunk/14?
> > > > > > > 
> > > > > > > -- >8 --
> > > > > > > 
> > > > > > > Here we're neglecting to update DECL_NAME during the alias CTAD
> > > > > > > guide
> > > > > > > transformation, which causes copy_guide_p to return false for the
> > > > > > > transformed copy deduction guide since DECL_NAME is still 
> > > > > > > __dguide_C
> > > > > > > with TREE_TYPE C but it should be __dguide_A with TREE_TYPE
> > > > > > > A
> > > > > > > (equivalently C).  This ultimately results in ambiguity
> > > > > > > during
> > > > > > > overload resolution between the copy deduction guide vs copy ctor
> > > > > > > guide.
> > > > > > > 
> > > > > > > This patch makes us update DECL_NAME of a transformed guide
> > > > > > > accordingly
> > > > > > > during alias CTAD.  This eventually needs to be done for inherited
> > > > > > > CTAD
> > > > > > > too, but it's not clear what identifier to use there since it has 
> > > > > > > to
> > > > > > > be
> > > > > > > unique for each derived/base pair.  For
> > > > > > > 
> > > > > > > >  template<class T> struct A { ... };
> > > > > > > >  template<class T> struct B : A<T> { using A<T>::A; }
> > > > > > > 
> > > > > > > at first glance it'd be reasonable to give inherited guides a name
> > > > > > > of
> > > > > > > __dguide_B with TREE_TYPE A, but since that name is
> > > > > > > already
> > > > > > > used B's own guides its TREE_TYPE is already B.
> > > > > > 
> > > > > > Why can't it be the same __dguide_B with TREE_TYPE B?
> > > > > 
> > > > > Ah because copy_guide_p relies on TREE_TYPE in order to recognize a 
> > > > > copy
> > > > > deduction guide, and with that TREE_TYPE it would still incorrectly
> > > > > return false for an inherited copy deduction guide, e.g.
> > > > > 
> > > > > A(A) -> A
> > > > > 
> > > > > gets transformed into
> > > > > 
> > > > > B(A) -> B
> > > > > 
> > > > > and A != B so copy_guide_p returns false.
> > > > 
> > > > Hmm, that seems correct; the transformed candidate is not the copy
> > > > deduction
> > > > guide for B.
> > > 
> > > By https://eel.is/c++draft/over.match.class.deduct#3.4 it seems that a
> > > class template can now have multiple copy deduction guides with inherited
> > > CTAD: the derived class's own copy guide, along with the transformed copy
> > > guides of its eligible base classes.  Do we want to follow the standard
> > > precisely here, or should we maybe restrict the copy-guideness propagation
> > > to alias CTAD only?
> > 
> > The latter, I think; it seems nonsensical to have multiple copy guides.
> 
> Sounds good, so for inherited CTAD it should suffice to use __dguide_B
> as the name (where B is the derived class), like so?

Ping.

> 
> -- >8 --
> 
> Subject: [PATCH] c++: alias CTAD and copy deduction guide [PR115198]
> 
> Here we're neglecting to update DECL_NAME during the alias CTAD guide
> transformation, which causes copy_guide_p to return false for the
> transformed copy deduction guide since DECL_NAME is still __dguide_C
> with TREE_TYPE C but it should be __dguide_A with TREE_TYPE A
> (equivalently C).  This ultimately results in ambiguity during
> overload resolution between the copy deduction guide vs copy ctor guide.
> 
> This patch makes us update DECL_NAME of a transformed guide accordingly
> during alias/inherited CTAD.
> 
>   PR c++/115198
> 
> gcc/cp/ChangeLog:
> 
>   * pt.cc (alias_ctad_tweaks): Update DECL_NAME of a transformed
>   guide.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/cpp2a/class-deduction-alias22.C: New test.
> ---
>  gcc/cp/pt.cc   |  6 +-
>  .../g++.dg/cpp2a/class-deduction-alias22.C | 14 ++
>  2 files changed, 19 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/g++.dg/cpp2a/class-deduction-alias22.C
> 
> diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
> index 607753ae6b7..daa8ac386dc 100644
> --- a/gcc/cp/pt.cc
> +++ b/gcc/cp/pt.cc
> @@ -30342,13 +30342,14 @@ alias_ctad_tweaks (tree tmpl, tree uguides)
>   any).  */
>  
>enum { alias, inherited } ctad_kind;
> -  tree atype, fullatparms, utype;
> +  tree atype, fullatparms, utype, name;
>if (TREE_CODE (tmpl) == TEMPLATE_DECL)
>  {
>ctad_kind = alias;
>atype = TREE_TYPE (tmpl);
>fullatparms = DECL_TEMPLATE_PARMS (tmpl);
>utype = DECL_ORIGINAL_TYPE (DECL_TEMPLATE_RESULT (tmpl));
> +  name = dguide_name (tmpl);
>  }
>else
>  {
> @@ -30356,6 +30357,8 @@ alias_ctad_tweaks (tree tmpl, tree uguides)
>atype = NULL_TREE;
>fulla

Re: [PATCH] arm: make arm_predict_doloop_p reject loops with calls

2024-06-25 Thread Richard Earnshaw (lists)
On 25/06/2024 12:53, Andre Vieira (lists) wrote:
> Hi,
> 
> With the introduction of low overhead loops in 
> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=3dfc28dbbd21b1d708aa40064380ef4c42c994d7
>  we defined arm_predict_doloop_p.  This is meant to be a lightweight check to 
> rule out loops we are not considering for doloop optimization, and it is used 
> by other passes to prevent optimizations that may hurt the doloop 
> optimization later on.  The check is meant to be lightweight 
> because it's used by pre-RTL optimizations, meaning we can't do the same 
> checks that doloop does.
> 
> After the definition of arm_predict_doloop_p, when testing for 
> armv8.1-m.main, tree-ssa/ivopts-3.c failed the scan-dump check as the dump 
> now matched an extra '!= 0' introduced by:
> Doloop cmp iv use: if (ivtmp_1 != 0)
> Predict loop 1 can perform doloop optimization later.
> 
> where previously we had:
> Predict doloop failure due to target specific checks.
> 
> and after this patch:
> Predict doloop failure due to call in loop.
> Predict doloop failure due to target specific checks.
> 
> Added a copy of the original tree-ssa/ivopts-3.c as a target-specific test to 
> check for the new dump message.
> 
> Ran a regression test for arm-none-eabi with 
> -march=armv8.1-m.main+mve/-mfpu=auto/-mthumb/-mfloat-abi=hard.
> 
> OK for trunk?
> 
> gcc/ChangeLog:
> 
>     * config/arm/arm.cc (arm_predict_doloop_p): Reject loops with 
> function calls that are not builtins.
> 
> gcc/testsuite/ChangeLog:
> 
>     * gcc.target/arm/mve/ivopts-3.c: New test.

OK.

R.


Re: [PATCH] c++: ICE with __has_unique_object_representations [PR115476]

2024-06-25 Thread Jason Merrill

On 6/25/24 07:15, Jonathan Wakely wrote:

On Tue, 25 Jun 2024 at 03:12, Jason Merrill  wrote:


On 6/18/24 10:31, Marek Polacek wrote:

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/14/13?


Makes sense to me, though probably the [meta.unary.prop] table should be
adjusted in the same way.  Jonathan, what do you think?


Just to make sure I understand correctly, the suggestion is to change
the precondition for the trait to something like:

"remove_all_extents_t<T> shall be a complete type or cv void."

i.e. if T is incomplete then T[] cannot be used with the trait, right?


Yes.

Jason




-- >8 --
Here we started to ICE with r13-25: in check_trait_type, for "X[]" we
return true here:

if (kind == 1 && TREE_CODE (type) == ARRAY_TYPE && !TYPE_DOMAIN (type))
  return true; // Array of unknown bound. Don't care about completeness.

and then end up crashing in record_has_unique_obj_representations:

4836      if (cur != wi::to_offset (sz))

because sz is null.

https://eel.is/c++draft/type.traits#tab:meta.unary.prop-row-47-column-3-sentence-1
says that the preconditions for __has_unique_object_representations are:
"T shall be a complete type, cv void, or an array of unknown bound" and
that "For an array type T, the same result as
has_unique_object_representations_v<remove_all_extents_t<T>>" so T[]
should be treated as T.  So we should use kind==2 for the trait.

   PR c++/115476

gcc/cp/ChangeLog:

   * semantics.cc (finish_trait_expr)
   <case CPTK_HAS_UNIQUE_OBJ_REPRESENTATIONS>: Move below to call
   check_trait_type with kind==2.

gcc/testsuite/ChangeLog:

   * g++.dg/cpp1z/has-unique-obj-representations4.C: New test.
---
   gcc/cp/semantics.cc  |  2 +-
   .../cpp1z/has-unique-obj-representations4.C  | 16 
   2 files changed, 17 insertions(+), 1 deletion(-)
   create mode 100644 
gcc/testsuite/g++.dg/cpp1z/has-unique-obj-representations4.C

diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 08f5f245e7d..42251b6764b 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12966,7 +12966,6 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, 
tree type1, tree type2)
   case CPTK_HAS_NOTHROW_COPY:
   case CPTK_HAS_TRIVIAL_COPY:
   case CPTK_HAS_TRIVIAL_DESTRUCTOR:
-case CPTK_HAS_UNIQUE_OBJ_REPRESENTATIONS:
 if (!check_trait_type (type1))
   return error_mark_node;
 break;
@@ -12976,6 +12975,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, 
tree type1, tree type2)
   case CPTK_IS_STD_LAYOUT:
   case CPTK_IS_TRIVIAL:
   case CPTK_IS_TRIVIALLY_COPYABLE:
+case CPTK_HAS_UNIQUE_OBJ_REPRESENTATIONS:
 if (!check_trait_type (type1, /* kind = */ 2))
   return error_mark_node;
 break;
diff --git a/gcc/testsuite/g++.dg/cpp1z/has-unique-obj-representations4.C 
b/gcc/testsuite/g++.dg/cpp1z/has-unique-obj-representations4.C
new file mode 100644
index 000..d6949dc7005
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/has-unique-obj-representations4.C
@@ -0,0 +1,16 @@
+// PR c++/115476
+// { dg-do compile { target c++11 } }
+
+struct X;
+static_assert(__has_unique_object_representations(X), ""); // { dg-error 
"invalid use of incomplete type" }
+static_assert(__has_unique_object_representations(X[]), "");  // { dg-error 
"invalid use of incomplete type" }
+static_assert(__has_unique_object_representations(X[1]), "");  // { dg-error 
"invalid use of incomplete type" }
+static_assert(__has_unique_object_representations(X[][1]), "");  // { dg-error 
"invalid use of incomplete type" }
+
+struct X {
+  int x;
+};
+static_assert(__has_unique_object_representations(X), "");
+static_assert(__has_unique_object_representations(X[]), "");
+static_assert(__has_unique_object_representations(X[1]), "");
+static_assert(__has_unique_object_representations(X[][1]), "");

base-commit: 7f9be55a4630134a237219af9cc8143e02080380
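
A minimal sketch of the array rule quoted in the commit message above.  It
uses the C++17 library trait rather than the builtin, so it should not depend
on a compiler that already has this fix.  The struct X mirrors the new test;
the helper names array_matches_element and check_at_run_time are made up for
illustration and are not part of the patch.

```cpp
#include <cassert>
#include <type_traits>

struct X {
  int x;
};

// Per [meta.unary.prop], an array type gives the same result as its
// element type with all extents removed.
constexpr bool array_matches_element ()
{
  constexpr bool elem = std::has_unique_object_representations_v<X>;
  return std::has_unique_object_representations_v<X[]> == elem
         && std::has_unique_object_representations_v<X[1]> == elem
         && std::has_unique_object_representations_v<X[][1]> == elem;
}

static_assert (array_matches_element (), "arrays follow their element type");

// Run-time spelling of the same check, convenient in a unit test.
inline void check_at_run_time ()
{
  assert (array_matches_element ());
}
```

With the patch applied, the same queries on an incomplete X are expected to
be rejected, which is what the dg-error lines in the new test check.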


Re: [PATCH] c++: decltype of by-ref capture proxy of ref [PR115504]

2024-06-25 Thread Jason Merrill

On 6/25/24 11:03, Patrick Palka wrote:

On Mon, 24 Jun 2024, Jason Merrill wrote:


On 6/24/24 21:00, Patrick Palka wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK
for trunk/14?

-- >8 --

The capture proxy handling in finish_decltype_type added in r14-5330
was stripping the reference type of a capture proxy's captured variable,
which is desirable for a by-value capture, but not for a by-ref capture
(of a reference).


I'm not sure why we would want it for by-value, either; regardless of the
capture kind, decltype(x) is int&.


Ah, makes sense.  But I guess that means

   void f(int& x) {
 [x]() {
   decltype(auto) a = x;
 }
   }

is ill-formed since decltype(x) is int& but the corresponding closure
member is const?  It works if we make the lambda mutable.


Yes, and clang agrees.  Let's also test that case.


Like so?  Bootstrapped and regtested on x86_64-pc-linux-gnu.


-- >8 --

Subject: [PATCH] c++: decltype of capture proxy of ref [PR115504]

The capture proxy handling in finish_decltype_type added in r14-5330
was stripping the reference type of a capture proxy's captured variable.

PR c++/115504

gcc/cp/ChangeLog:

* semantics.cc (finish_decltype_type): Don't strip the reference
type (if any) of a capture proxy's captured variable.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/decltype-auto8.C: New test.
---
  gcc/cp/semantics.cc |  1 -
  gcc/testsuite/g++.dg/cpp1y/decltype-auto8.C | 18 ++
  2 files changed, 18 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp1y/decltype-auto8.C

diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 6c1813d37c6..6a383c0f7f9 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12071,7 +12071,6 @@ finish_decltype_type (tree expr, bool 
id_expression_or_member_access_p,
{
  expr = DECL_CAPTURED_VARIABLE (expr);
  type = TREE_TYPE (expr);
- type = non_reference (type);
}
  else
{
diff --git a/gcc/testsuite/g++.dg/cpp1y/decltype-auto8.C 
b/gcc/testsuite/g++.dg/cpp1y/decltype-auto8.C
new file mode 100644
index 000..1100b94a5b7
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/decltype-auto8.C
@@ -0,0 +1,18 @@
+// PR c++/115504
+// { dg-do compile { target c++14 } }
+
+void f(int& x) {
+  [&x]() {
+decltype(auto) a = x;
+using type = decltype(x);
+using type = decltype(a);
+using type = int&; // not 'int'
+  };
+
+  [x]() mutable {
+decltype(auto) a = x;
+using type = decltype(x);
+using type = decltype(a);
+using type = int&;
+  };
+}




Re: [PATCH] c++: alias CTAD and copy deduction guide [PR115198]

2024-06-25 Thread Jason Merrill

On 6/13/24 13:00, Patrick Palka wrote:

On Thu, 13 Jun 2024, Jason Merrill wrote:


On 6/13/24 11:05, Patrick Palka wrote:

On Thu, 23 May 2024, Jason Merrill wrote:


On 5/23/24 17:42, Patrick Palka wrote:

On Thu, 23 May 2024, Jason Merrill wrote:


On 5/23/24 14:06, Patrick Palka wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look
OK for trunk/14?

-- >8 --

Here we're neglecting to update DECL_NAME during the alias CTAD
guide
transformation, which causes copy_guide_p to return false for the
transformed copy deduction guide since DECL_NAME is still __dguide_C
with TREE_TYPE C but it should be __dguide_A with TREE_TYPE
A
(equivalently C).  This ultimately results in ambiguity
during
overload resolution between the copy deduction guide vs copy ctor
guide.

This patch makes us update DECL_NAME of a transformed guide
accordingly
during alias CTAD.  This eventually needs to be done for inherited
CTAD
too, but it's not clear what identifier to use there since it has to
be
unique for each derived/base pair.  For

  template<class T> struct A { ... };
  template<class T> struct B : A<T> { using A<T>::A; }

at first glance it'd be reasonable to give inherited guides a name
of
__dguide_B with TREE_TYPE A, but since that name is
already
used B's own guides its TREE_TYPE is already B.


Why can't it be the same __dguide_B with TREE_TYPE B?


Ah because copy_guide_p relies on TREE_TYPE in order to recognize a copy
deduction guide, and with that TREE_TYPE it would still incorrectly
return false for an inherited copy deduction guide, e.g.

 A(A) -> A

gets transformed into

 B(A) -> B

and A != B so copy_guide_p returns false.


Hmm, that seems correct; the transformed candidate is not the copy
deduction
guide for B.


By https://eel.is/c++draft/over.match.class.deduct#3.4 it seems that a
class template can now have multiple copy deduction guides with inherited
CTAD: the derived class's own copy guide, along with the transformed copy
guides of its eligible base classes.  Do we want to follow the standard
precisely here, or should we maybe restrict the copy-guideness propagation
to alias CTAD only?


The latter, I think; it seems nonsensical to have multiple copy guides.


Sounds good, so for inherited CTAD it should suffice to use __dguide_B
as the name (where B is the derived class), like so?


OK.
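
For background on the tie-break this thread is about, here is a small sketch
of ordinary (non-alias) CTAD preferring the copy deduction candidate.  Box
and copy_candidate_wins are hypothetical names chosen for illustration; this
is not the testcase from the PR, and it deliberately avoids alias templates
so that it does not depend on a compiler with the fix.

```cpp
#include <cassert>
#include <type_traits>

template <class T>
struct Box {
  T value;
  Box (T v) : value (v) {}
};

inline bool copy_candidate_wins ()
{
  Box a (1);   // deduces Box<int> from the constructor Box(T)
  Box b (a);   // copy deduction candidate wins: Box<int>, not Box<Box<int>>

  static_assert (std::is_same_v<decltype (b), Box<int>>, "");
  assert (a.value == 1);
  return b.value == 1;
}
```

The guide generated from the copy constructor and the copy deduction
candidate are otherwise tied, so overload resolution only succeeds if the
latter is recognized as such; that recognition is what copy_guide_p provides
inside the compiler.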


-- >8 --

Subject: [PATCH] c++: alias CTAD and copy deduction guide [PR115198]

Here we're neglecting to update DECL_NAME during the alias CTAD guide
transformation, which causes copy_guide_p to return false for the
transformed copy deduction guide since DECL_NAME is still __dguide_C
with TREE_TYPE C but it should be __dguide_A with TREE_TYPE A
(equivalently C).  This ultimately results in ambiguity during
overload resolution between the copy deduction guide vs copy ctor guide.

This patch makes us update DECL_NAME of a transformed guide accordingly
during alias/inherited CTAD.

PR c++/115198

gcc/cp/ChangeLog:

* pt.cc (alias_ctad_tweaks): Update DECL_NAME of a transformed
guide.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/class-deduction-alias22.C: New test.
---
  gcc/cp/pt.cc   |  6 +-
  .../g++.dg/cpp2a/class-deduction-alias22.C | 14 ++
  2 files changed, 19 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/class-deduction-alias22.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 607753ae6b7..daa8ac386dc 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -30342,13 +30342,14 @@ alias_ctad_tweaks (tree tmpl, tree uguides)
   any).  */
  
enum { alias, inherited } ctad_kind;

-  tree atype, fullatparms, utype;
+  tree atype, fullatparms, utype, name;
if (TREE_CODE (tmpl) == TEMPLATE_DECL)
  {
ctad_kind = alias;
atype = TREE_TYPE (tmpl);
fullatparms = DECL_TEMPLATE_PARMS (tmpl);
utype = DECL_ORIGINAL_TYPE (DECL_TEMPLATE_RESULT (tmpl));
+  name = dguide_name (tmpl);
  }
else
  {
@@ -30356,6 +30357,8 @@ alias_ctad_tweaks (tree tmpl, tree uguides)
atype = NULL_TREE;
fullatparms = TREE_PURPOSE (tmpl);
utype = TREE_VALUE (tmpl);
+  name = dguide_name (TPARMS_PRIMARY_TEMPLATE
+ (INNERMOST_TEMPLATE_PARMS (fullatparms)));
  }
  
tsubst_flags_t complain = tf_warning_or_error;

@@ -30451,6 +30454,7 @@ alias_ctad_tweaks (tree tmpl, tree uguides)
}
  if (g == error_mark_node)
continue;
+ DECL_NAME (g) = name;
  if (nfparms == 0)
{
  /* The targs are all non-dependent, so g isn't a template.  */
diff --git a/gcc/testsuite/g++.dg/cpp2a/class-deduction-alias22.C 
b/gcc/testsuite/g++.dg/cpp2a/class-deduction-alias22.C
new file mode 100644
index 000..9c6c841166a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/class-deduction-alias22.C
@@ -0,0 +1,14 @@
+// PR c++/115198
+// { dg-do compile { target c++20 } }
+
+t

Re: [PATCH] c++: decltype of by-ref capture proxy of ref [PR115504]

2024-06-25 Thread Patrick Palka
On Tue, 25 Jun 2024, Jason Merrill wrote:

> On 6/25/24 11:03, Patrick Palka wrote:
> > On Mon, 24 Jun 2024, Jason Merrill wrote:
> > 
> > > On 6/24/24 21:00, Patrick Palka wrote:
> > > > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK
> > > > for trunk/14?
> > > > 
> > > > -- >8 --
> > > > 
> > > > The capture proxy handling in finish_decltype_type added in r14-5330
> > > > was stripping the reference type of a capture proxy's captured variable,
> > > > which is desirable for a by-value capture, but not for a by-ref capture
> > > > (of a reference).
> > > 
> > > I'm not sure why we would want it for by-value, either; regardless of the
> > > capture kind, decltype(x) is int&.
> > 
> > Ah, makes sense.  But I guess that means
> > 
> >void f(int& x) {
> >  [x]() {
> >decltype(auto) a = x;
> >  }
> >}
> > 
> > is ill-formed since decltype(x) is int& but the corresponding closure
> > member is const?  It works if we make the lambda mutable.
> 
> Yes, and clang agrees.  Let's also test that case.

Like so?

-- >8 --

Subject: [PATCH] c++: decltype of capture proxy of ref [PR115504]

The capture proxy handling in finish_decltype_type added in r14-5330
was stripping the reference type of a capture proxy's captured variable.

PR c++/115504

gcc/cp/ChangeLog:

* semantics.cc (finish_decltype_type): Don't strip the reference
type (if any) of a capture proxy's captured variable.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/decltype-auto8.C: New test.
---
 gcc/cp/semantics.cc |  1 -
 gcc/testsuite/g++.dg/cpp1y/decltype-auto8.C | 22 +
 2 files changed, 22 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1y/decltype-auto8.C

diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 6c1813d37c6..6a383c0f7f9 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12071,7 +12071,6 @@ finish_decltype_type (tree expr, bool 
id_expression_or_member_access_p,
{
  expr = DECL_CAPTURED_VARIABLE (expr);
  type = TREE_TYPE (expr);
- type = non_reference (type);
}
  else
{
diff --git a/gcc/testsuite/g++.dg/cpp1y/decltype-auto8.C 
b/gcc/testsuite/g++.dg/cpp1y/decltype-auto8.C
new file mode 100644
index 000..55135cecf72
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/decltype-auto8.C
@@ -0,0 +1,22 @@
+// PR c++/115504
+// { dg-do compile { target c++14 } }
+
+void f(int& x, const int& y) {
+  [&x]() {
+decltype(auto) a = x;
+using type = decltype(x);
+using type = decltype(a);
+using type = int&; // not 'int'
+  };
+
+  [x]() {
+decltype(auto) a = x; // { dg-error "discards qualifiers" }
+  };
+
+  [x]() mutable {
+decltype(auto) a = x;
+using type = decltype(x);
+using type = decltype(a);
+using type = int&;
+  };
+}
-- 
2.45.2.648.g1e1586e4ed
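
As a companion to the new test, a sketch of the general distinction between
decltype(x) and decltype((x)) inside a lambda.  It assumes a plain int local
rather than a captured reference, so it should not depend on a compiler with
this fix; the function name capture_types_ok is made up for illustration.

```cpp
#include <cassert>
#include <type_traits>

inline bool capture_types_ok ()
{
  int x = 0;

  auto by_value = [x] () {
    // decltype(x) names the declared type of the captured entity.
    static_assert (std::is_same_v<decltype (x), int>, "");
    // decltype((x)) is the type of the expression, which refers to the
    // closure member; that member is const in a non-mutable lambda.
    static_assert (std::is_same_v<decltype ((x)), const int &>, "");
    return x;
  };

  auto by_value_mutable = [x] () mutable {
    // In a mutable lambda the closure member is not const.
    static_assert (std::is_same_v<decltype ((x)), int &>, "");
    return ++x;
  };

  assert (x == 0);
  // The copies inside the closures change independently of the original.
  return by_value () == 0 && by_value_mutable () == 1 && x == 0;
}
```

The const closure member is the same reason the non-mutable [x] case in the
test is expected to fail with "discards qualifiers" when x is an int&.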

Re: [PATCH] Hard register asm constraint

2024-06-25 Thread Stefan Schulze Frielinghaus
On Tue, Jun 25, 2024 at 10:03:34AM -0400, Paul Koning wrote:
> 
> 
> > On Jun 24, 2024, at 1:50 AM, Stefan Schulze Frielinghaus 
> >  wrote:
> > 
> > Ping.
> > 
> > On Mon, Jun 10, 2024 at 07:19:19AM +0200, Stefan Schulze Frielinghaus wrote:
> >> Ping.
> >> 
> >> On Fri, May 24, 2024 at 11:13:12AM +0200, Stefan Schulze Frielinghaus 
> >> wrote:
> >>> This implements hard register constraints for inline asm.  A hard register
> >>> constraint is of the form {regname} where regname is any valid register.  
> >>> This
> >>> basically renders register asm superfluous.  For example, the snippet
> >>> 
> >>> int test (int x, int y)
> >>> {
> >>>  register int r4 asm ("r4") = x;
> >>>  register int r5 asm ("r5") = y;
> >>>  unsigned int copy = y;
> >>>  asm ("foo %0,%1,%2" : "+d" (r4) : "d" (r5), "d" (copy));
> >>>  return r4;
> >>> }
> >>> 
> >>> could be rewritten into
> >>> 
> >>> int test (int x, int y)
> >>> {
> >>>  asm ("foo %0,%1,%2" : "+{r4}" (x) : "{r5}" (y), "d" (y));
> >>>  return x;
> >>> }
> 
> I like this idea but I'm wondering: regular constraints specify what sort of 
> value is needed, for example an int vs. a short int vs. a float.  The 
> notation you've shown doesn't seem to have that aspect.

As Maciej already pointed out, the type of the expression should suffice.
My assumption was that an asm can deal with a value as is or with its
promoted value.  At least for integer values this should be fine, and
AFAICS this is also the case for simple constraints like "r", which do not
define any mode.  I may have overlooked something, but which constraint
differentiates between int and short?  However, you have a good point
here and I should test this more.

> The other comment is that I didn't see documentation updates to reflect this 
> new feature.

I haven't written any documentation yet since I was not sure whether
such a proposal would be accepted at all, i.e., I just wanted to hear
whether you see any showstoppers.  Assuming this goes well, I
guess it should be documented under simple constraints
https://gcc.gnu.org/onlinedocs/gcc/Simple-Constraints.html

Thanks,
Stefan


Re: [PATCH][c++ frontend]: check for missing condition for novector [PR115623]

2024-06-25 Thread Jason Merrill

On 6/25/24 04:01, Tamar Christina wrote:

Hi All,

It looks like I forgot to check in the C++ frontend if a condition exists for the
loop being adorned with novector.  This causes a segfault because cond isn't
expected to be null.

This fixes it by issuing the same kind of diagnostics we issue for the other
pragmas.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

OK for master and a backport to GCC 14?


Hmm, I'm not sure we want to error in this case; it's pointless, but 
indeed we aren't going to vectorize a loop that always loops.  I'd think 
we should treat it the same as an explicit "true" condition.  And 
perhaps the same for unroll/ivdep.


Does the C front-end treat the null condition different from a constant 
true condition?



Thanks,
Tamar

gcc/cp/ChangeLog:

PR c++/115623
* parser.cc (cp_parser_c_for): Add check for C++ cond.

gcc/testsuite/ChangeLog:

PR c++/115623
* g++.dg/vect/vect-novector-pragma_2.cc: New test.

---
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 
e7409b856f1127e303c6515a3bb2d61a10e7c378..24d7b0e4992fdff69951ac5955f304e473f53374
 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -14107,6 +14107,12 @@ cp_parser_c_for (cp_parser *parser, tree scope, tree 
init, bool ivdep,
   "%<GCC unroll%> pragma");
condition = error_mark_node;
  }
+  else if (novector)
+{
+  cp_parser_error (parser, "missing loop condition in loop with "
+  "%<GCC novector%> pragma");
+  condition = error_mark_node;
+}
finish_for_cond (condition, stmt, ivdep, unroll, novector);
/* Look for the `;'.  */
cp_parser_require (parser, CPP_SEMICOLON, RT_SEMICOLON);
diff --git a/gcc/testsuite/g++.dg/vect/vect-novector-pragma_2.cc 
b/gcc/testsuite/g++.dg/vect/vect-novector-pragma_2.cc
new file mode 100644
index 
..05dba4db1c6544bc53cd05482d1b2e767052cf43
--- /dev/null
+++ b/gcc/testsuite/g++.dg/vect/vect-novector-pragma_2.cc
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+
+void f (char *a, int i)
+{
+#pragma GCC novector
+  for (;;i++)
+a[i] *= 2;
+}
+
+/* { dg-error "missing loop condition in loop with 'GCC novector' pragma before ';' 
token" "" { target *-*-* } 6 } */

Re: [PATCH] c++: structured bindings and lookup of tuple_size/tuple_element [PR115605]

2024-06-25 Thread Marek Polacek
On Mon, Jun 24, 2024 at 10:00:40PM -0700, Andrew Pinski wrote:
> The problem here is that even though we pass the std namespace to
> lookup_template_class as the context, it will look at the current scope for
> the name too.  The fix is to look up the qualified name first and then use
> that for lookup_template_class.
> This is how std::initializer_list is handled in listify.
> 
> Note g++.dg/cpp1z/decomp22.C testcase now fails correctly
> with an error, that tuple_size is not in the std namespace.
> I copied a fixed up testcase into g++.dg/cpp1z/decomp62.C.
> 
> Bootstrapped and tested on x86_64-linux-gnu with no regressions.
> 
>   PR c++/115605
> 
> gcc/cp/ChangeLog:
> 
>   * decl.cc (get_tuple_size): Call lookup_qualified_name
>   before calling lookup_template_class.
>   (get_tuple_element_type): Likewise.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/cpp1z/decomp22.C: Expect an error

Missing .

>   * g++.dg/cpp1z/decomp61.C: New test.
>   * g++.dg/cpp1z/decomp62.C: Copied from decomp22.C
>   and wrap tuple_size/tuple_element inside std namespace.
> 
> Signed-off-by: Andrew Pinski 
> ---
>  gcc/cp/decl.cc| 16 +---
>  gcc/testsuite/g++.dg/cpp1z/decomp22.C |  2 +-
>  gcc/testsuite/g++.dg/cpp1z/decomp61.C | 53 +++
>  gcc/testsuite/g++.dg/cpp1z/decomp62.C | 23 
>  4 files changed, 88 insertions(+), 6 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/cpp1z/decomp61.C
>  create mode 100644 gcc/testsuite/g++.dg/cpp1z/decomp62.C
> 
> diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
> index 03deb1493a4..81dde4d51a3 100644
> --- a/gcc/cp/decl.cc
> +++ b/gcc/cp/decl.cc
> @@ -9195,10 +9195,13 @@ get_tuple_size (tree type)
>  {
>tree args = make_tree_vec (1);
>TREE_VEC_ELT (args, 0) = type;
> -  tree inst = lookup_template_class (tuple_size_identifier, args,
> +  tree std_tuple_size = lookup_qualified_name (std_node, 
> tuple_size_identifier);
> +  if (std_tuple_size == error_mark_node)
> +return NULL_TREE;
> +  tree inst = lookup_template_class (std_tuple_size, args,
>/*in_decl*/NULL_TREE,
> -  /*context*/std_node,
> -  tf_none);
> +  /*context*/NULL_TREE,
> +  tf_warning_or_error);
>inst = complete_type (inst);
>if (inst == error_mark_node
>|| !COMPLETE_TYPE_P (inst)
> @@ -9224,9 +9227,12 @@ get_tuple_element_type (tree type, unsigned i)
>tree args = make_tree_vec (2);
>TREE_VEC_ELT (args, 0) = build_int_cst (integer_type_node, i);
>TREE_VEC_ELT (args, 1) = type;
> -  tree inst = lookup_template_class (tuple_element_identifier, args,
> +  tree std_tuple_elem = lookup_qualified_name (std_node, 
> tuple_element_identifier);

This line is too long.

> +  if (std_tuple_elem == error_mark_node)
> +return NULL_TREE;
> +  tree inst = lookup_template_class (std_tuple_elem, args,
>/*in_decl*/NULL_TREE,
> -  /*context*/std_node,
> +  /*context*/NULL_TREE,
>tf_warning_or_error);
>return make_typename_type (inst, type_identifier,
>none_type, tf_warning_or_error);
> diff --git a/gcc/testsuite/g++.dg/cpp1z/decomp22.C 
> b/gcc/testsuite/g++.dg/cpp1z/decomp22.C
> index 9e6b8df486a..4131486e292 100644
> --- a/gcc/testsuite/g++.dg/cpp1z/decomp22.C
> +++ b/gcc/testsuite/g++.dg/cpp1z/decomp22.C
> @@ -17,5 +17,5 @@ int
>  foo (C t)
>  {
>auto[x0] = t;  // { dg-warning "structured bindings only available 
> with" "" { target c++14_down } }
> -  return x0;
> +  return x0; /* { dg-error "cannot convert" } */
>  }
> diff --git a/gcc/testsuite/g++.dg/cpp1z/decomp61.C 
> b/gcc/testsuite/g++.dg/cpp1z/decomp61.C
> new file mode 100644
> index 000..874844b2c61
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/cpp1z/decomp61.C
> @@ -0,0 +1,53 @@
> +// PR c++/115605
> +// { dg-do compile { target c++17 } }
> +// { dg-options "" }

(I don't think you need the empty dg-options here.)

The patch looks good to me otherwise, thanks.

Marek
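
For reference, a minimal sketch of the tuple-like protocol that structured
bindings rely on, with tuple_size and tuple_element specialized inside
namespace std as the fixed-up decomp62.C is described to do.  The type C and
the function first_of are hypothetical and are not copied from the testcase.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <type_traits>

struct C {
  int value;
  // Member get<I> is found by the structured-binding protocol.
  template <std::size_t I> int get () const { return value; }
};

namespace std {
  // These must be specializations of the std templates; same-named
  // templates in another scope are not part of the protocol.
  template <> struct tuple_size<C> : integral_constant<size_t, 1> {};
  template <> struct tuple_element<0, C> { using type = int; };
}

inline int first_of (C c)
{
  auto [x0] = c;   // uses std::tuple_size<C> and c.get<0>()
  assert (x0 == c.value);
  return x0;
}
```

If the two specializations were instead declared as unrelated templates at
global scope, C would be decomposed as a plain aggregate rather than through
get, which is the distinction the qualified lookup in the patch enforces.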



Re: [PATCH][c++ frontend]: check for missing condition for novector [PR115623]

2024-06-25 Thread Tamar Christina
The 06/25/2024 17:10, Jason Merrill wrote:
> On 6/25/24 04:01, Tamar Christina wrote:
> > Hi All,
> > 
> > It looks like I forgot to check in the C++ frontend if a condition exists for
> > the
> > loop being adorned with novector.  This causes a segfault because cond isn't
> > expected to be null.
> > 
> > This fixes it by issuing the same kind of diagnostics we issue for the other
> > pragmas.
> > 
> > Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
> > 
> > OK for master and a backport to GCC 14?
> 
> Hmm, I'm not sure we want to error in this case; it's pointless, but 
> indeed we aren't going to vectorize a loop that always loops.  I'd think 
> we should treat it the same as an explicit "true" condition.  And 
> perhaps the same for unroll/ivdep.
> 
> Does the C front-end treat the null condition different from a constant 
> true condition?
> 

No, in the C front-end we error for ivdep and unroll, but for novector we
explicitly suppress it by checking for
novector && cond && cond != error_mark_node instead of just
novector && cond != error_mark_node at the use site.

Do you want to handle it that way to be consistent?

Cheers,
Tamar
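
To make the difference between the two checks concrete, here is a toy model.
The types node and tree and the object error_mark below are hypothetical
stand-ins, not GCC's real tree API; only the shape of the two conditions is
taken from the description above.

```cpp
#include <cassert>

// Hypothetical stand-ins for GCC internals.
struct node {};
using tree = node *;

static node error_node;                      // plays the role of error_mark_node
static const tree error_mark = &error_node;

// Check without the null test: also true when the condition is missing.
inline bool annotate_unguarded (bool novector, tree cond)
{
  return novector && cond != error_mark;
}

// Check with the null test: a missing condition is silently skipped.
inline bool annotate_guarded (bool novector, tree cond)
{
  return novector && cond && cond != error_mark;
}

inline void self_check ()
{
  node real_cond;
  assert (annotate_guarded (true, &real_cond));
  assert (!annotate_guarded (true, nullptr));   // for (;;): nothing to annotate
  assert (annotate_unguarded (true, nullptr));  // would go on to use a null cond
}
```

In this model the guarded form simply does nothing for a loop without a
condition, which matches treating it as not needing the annotation.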
> > Thanks,
> > Tamar
> > 
> > gcc/cp/ChangeLog:
> > 
> > PR c++/115623
> > * parser.cc (cp_parser_c_for): Add check for C++ cond.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > PR c++/115623
> > * g++.dg/vect/vect-novector-pragma_2.cc: New test.
> > 
> > ---
> > diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
> > index 
> > e7409b856f1127e303c6515a3bb2d61a10e7c378..24d7b0e4992fdff69951ac5955f304e473f53374
> >  100644
> > --- a/gcc/cp/parser.cc
> > +++ b/gcc/cp/parser.cc
> > @@ -14107,6 +14107,12 @@ cp_parser_c_for (cp_parser *parser, tree scope, 
> > tree init, bool ivdep,
> >"%<GCC unroll%> pragma");
> > condition = error_mark_node;
> >   }
> > +  else if (novector)
> > +{
> > +  cp_parser_error (parser, "missing loop condition in loop with "
> > +  "%<GCC novector%> pragma");
> > +  condition = error_mark_node;
> > +}
> > finish_for_cond (condition, stmt, ivdep, unroll, novector);
> > /* Look for the `;'.  */
> > cp_parser_require (parser, CPP_SEMICOLON, RT_SEMICOLON);
> > diff --git a/gcc/testsuite/g++.dg/vect/vect-novector-pragma_2.cc 
> > b/gcc/testsuite/g++.dg/vect/vect-novector-pragma_2.cc
> > new file mode 100644
> > index 
> > ..05dba4db1c6544bc53cd05482d1b2e767052cf43
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/vect/vect-novector-pragma_2.cc
> > @@ -0,0 +1,10 @@
> > +/* { dg-do compile } */
> > +
> > +void f (char *a, int i)
> > +{
> > +#pragma GCC novector
> > +  for (;;i++)
> > +a[i] *= 2;
> > +}
> > +
> > +/* { dg-error "missing loop condition in loop with 'GCC novector' pragma 
> > before ';' token" "" { target *-*-* } 6 } */
> > 
> > 
> > 
> > 
> 

-- 


[Patch, rs6000, middle-end] v5: Add implementation for different targets for pair mem fusion

2024-06-25 Thread Ajit Agarwal
Hello All:

This patch addressed cleanup of the code and fix linaro failures.

All comments are addressed.

Common infrastructure using generic code for pair mem fusion of different
targets.

rs6000 target specific code implement virtual functions defined by generic code.

Target specific code are added in rs6000-mem-fusion.cc.

Bootstrapped and regtested on powerpc64-linux-gnu.

Thanks & Regards
Ajit
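
The split between generic code and target code described above can be
pictured with a small sketch.  Everything here is hypothetical: the names
pair_fusion_sketch, size_ok, count_pairs, mem_access and vec16_target are
invented for illustration and are not the interfaces in pair-fusion.h or
rs6000-mem-fusion.cc.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one memory access.
struct mem_access {
  long offset;
  int size;
};

// Generic part: owns the pairing logic and asks the target via a pure
// virtual hook whether a given access size may be fused.
class pair_fusion_sketch {
public:
  virtual ~pair_fusion_sketch () = default;

  virtual bool size_ok (int size) const = 0;

  // Count non-overlapping pairs of adjacent, equally sized accesses that
  // the target accepts.
  std::size_t count_pairs (const std::vector<mem_access> &v) const
  {
    std::size_t pairs = 0;
    for (std::size_t i = 0; i + 1 < v.size (); )
      {
        const mem_access &a = v[i];
        const mem_access &b = v[i + 1];
        if (a.size == b.size && size_ok (a.size)
            && a.offset + a.size == b.offset)
          {
            ++pairs;
            i += 2;
          }
        else
          ++i;
      }
    return pairs;
  }
};

// Target part: only supplies the hook.  The 16-byte rule is an assumption
// made up for this sketch.
class vec16_target final : public pair_fusion_sketch {
public:
  bool size_ok (int size) const override { return size == 16; }
};

inline void self_check ()
{
  vec16_target t;
  assert (t.count_pairs ({{0, 16}, {16, 16}}) == 1);
}
```

The point of the design is that a new target only overrides the hooks, while
the walk over candidate accesses stays in one shared place.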


rs6000, middle-end: Add implementation for different targets for pair mem fusion

Common infrastructure using generic code for pair mem fusion of different
targets.

rs6000 target-specific code implements virtual functions defined by generic code.

Target-specific code is added in rs6000-mem-fusion.cc.

2024-06-25  Ajit Kumar Agarwal  

gcc/ChangeLog:

* config/rs6000/rs6000-passes.def: New mem fusion pass
before pass_early_remat.
* pair-fusion.h: Add additional pure virtual function
required for rs6000 target implementation.
* pair-fusion.cc: Use the additional virtual functions added
for the rs6000 target.
* config/rs6000/rs6000-mem-fusion.cc: Add new pass.
Add target specific implementation for generic pure virtual
functions.
* config/rs6000/mma.md: Modify movoo machine description.
Add new machine description movoo1.
* config/rs6000/rs6000.cc: Modify rs6000_split_multireg_move
to expand movoo machine description for all constraints.
* config.gcc: Add new object file.
* config/rs6000/rs6000-protos.h: Add new prototype for mem
fusion pass.
* config/rs6000/t-rs6000: Add new rule.
* rtl-ssa/functions.h: Move out allocate function from private
to public and add get_m_temp_defs function.

gcc/testsuite/ChangeLog:

* g++.target/powerpc/mem-fusion.C: New test.
* g++.target/powerpc/mem-fusion-1.C: New test.
* gcc.target/powerpc/mma-builtin-1.c: Modify test.
---
 gcc/config.gcc|   2 +
 gcc/config/rs6000/mma.md  |  26 +-
 gcc/config/rs6000/rs6000-mem-fusion.cc| 724 ++
 gcc/config/rs6000/rs6000-passes.def   |   4 +-
 gcc/config/rs6000/rs6000-protos.h |   1 +
 gcc/config/rs6000/rs6000.cc   |  56 +-
 gcc/config/rs6000/rs6000.md   |   1 +
 gcc/config/rs6000/t-rs6000|   5 +
 gcc/pair-fusion.cc|  27 +-
 gcc/pair-fusion.h |  34 +
 gcc/rtl-ssa/functions.h   |  11 +-
 .../g++.target/powerpc/mem-fusion-1.C |  22 +
 gcc/testsuite/g++.target/powerpc/mem-fusion.C |  15 +
 .../gcc.target/powerpc/mma-builtin-1.c|   4 +-
 14 files changed, 905 insertions(+), 27 deletions(-)
 create mode 100644 gcc/config/rs6000/rs6000-mem-fusion.cc
 create mode 100644 gcc/testsuite/g++.target/powerpc/mem-fusion-1.C
 create mode 100644 gcc/testsuite/g++.target/powerpc/mem-fusion.C

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 644c456290d..a032723152f 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -524,6 +524,7 @@ powerpc*-*-*)
extra_objs="rs6000-string.o rs6000-p8swap.o rs6000-logue.o"
extra_objs="${extra_objs} rs6000-call.o rs6000-pcrel-opt.o"
extra_objs="${extra_objs} rs6000-builtins.o rs6000-builtin.o"
+   extra_objs="${extra_objs} rs6000-mem-fusion.o"
extra_headers="ppc-asm.h altivec.h htmintrin.h htmxlintrin.h"
extra_headers="${extra_headers} bmi2intrin.h bmiintrin.h"
extra_headers="${extra_headers} xmmintrin.h mm_malloc.h emmintrin.h"
@@ -560,6 +561,7 @@ rs6000*-*-*)
extra_options="${extra_options} g.opt fused-madd.opt 
rs6000/rs6000-tables.opt"
extra_objs="rs6000-string.o rs6000-p8swap.o rs6000-logue.o"
extra_objs="${extra_objs} rs6000-call.o rs6000-pcrel-opt.o"
+   extra_objs="${extra_objs} rs6000-mem-fusion.o"
target_gtfiles="$target_gtfiles 
\$(srcdir)/config/rs6000/rs6000-logue.cc 
\$(srcdir)/config/rs6000/rs6000-call.cc"
target_gtfiles="$target_gtfiles 
\$(srcdir)/config/rs6000/rs6000-pcrel-opt.cc"
;;
diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
index 04e2d0066df..88413926a02 100644
--- a/gcc/config/rs6000/mma.md
+++ b/gcc/config/rs6000/mma.md
@@ -294,7 +294,31 @@
 
 (define_insn_and_split "*movoo"
   [(set (match_operand:OO 0 "nonimmediate_operand" "=wa,ZwO,wa")
-   (match_operand:OO 1 "input_operand" "ZwO,wa,wa"))]
+(match_operand:OO 1 "input_operand" "ZwO,wa,wa"))]
+  "TARGET_MMA
+   && (gpc_reg_operand (operands[0], OOmode)
+   || gpc_reg_operand (operands[1], OOmode))"
+;;""
+  "@
+   #
+   #
+   #"
+  "&& reload_completed"
+  [(const_int 0)]
+{
+  rs6000_split_multireg_move (operands[0], operands[1]);
+  DONE;
+}
+  [(set_attr "type" "vecload,vecstore,veclogical")
+   (set_attr "length" "*,*,8")])
+;;   (set_attr "max_prefixed_insns" "2,2,*")])
+
+
+(de

Re: [PATCH] Hard register asm constraint

2024-06-25 Thread Paul Koning



> On Jun 25, 2024, at 12:04 PM, Stefan Schulze Frielinghaus 
>  wrote:
> 
> On Tue, Jun 25, 2024 at 10:03:34AM -0400, Paul Koning wrote:
>> 
> ...
> could be rewritten into
> 
> int test (int x, int y)
> {
> asm ("foo %0,%1,%2" : "+{r4}" (x) : "{r5}" (y), "d" (y));
> return x;
> }
>> 
>> I like this idea but I'm wondering: regular constraints specify what sort of 
>> value is needed, for example an int vs. a short int vs. a float.  The 
>> notation you've shown doesn't seem to have that aspect.
> 
> As Maciej already pointed out the type of the expression should suffice.
> My assumption was that an asm can deal with a value as is or its
> promoted value.  At least for integer values this should be fine and
> AFAICS is also the case for simple constraints like "r" which do not
> define any mode.  I've probably overlooked something, but which constraint
> differentiates between int vs short?  However, you have a good point
> with this and I should test this more.

I thought there was but I may be confused.  On the other hand, there definitely 
are (machine dependent) constraints that distinguish, say, float from integer 
registers; pdp11 is an example.  If you were to use an "a" constraint, that 
means a floating point register and the compiler will detect attempts to pass 
non-float operands ("Inconsistent operand constraints...").

I see that the existing "register int ..." syntax appears to check that the 
register is the right type for the data type given for it, so for example on 
pdp11, 

register int ac1 asm ("ac1") = i;

fails ("register ... isn't suitable for data type").  I assume your new syntax 
would perform the same check and produce roughly the same error message.  You 
might verify that.  On pdp11, trying to use, for example, "r0" for a float, or 
"ac0" for an int, would produce that error.

With all that, I think your approach does work right and the question I raised 
isn't actually a problem.
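
For reference, a minimal sketch of the existing explicit register variable syntax discussed above, written so that it also builds where the extension or the target is unavailable. The register (rcx), the x86-64 guard and the add_one helper are assumptions chosen for illustration; they are not taken from the patch, and the proposed "{reg}" constraint syntax is deliberately not used here.

```cpp
// Existing GNU "explicit register variable" syntax, guarded so the sketch
// still builds on other compilers and targets.
long long add_one (long long x)
{
#if defined (__GNUC__) && defined (__x86_64__)
  // Pin V to a hard register; the compiler checks that the register suits
  // the declared type.
  register long long v asm ("rcx") = x;
  // "r" carries no mode of its own: the type of the operand supplies it.
  asm ("addq $1, %0" : "+r" (v));
  return v;
#else
  return x + 1;   // portable fallback without inline asm
#endif
}
```

Calling add_one (41) is expected to give 42 on either path; the interesting part is only that the register/type pairing is validated at the declaration.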

>> The other comment is that I didn't see documentation updates to reflect this 
>> new feature.
> 
> I haven't come up with documentation yet since I was not sure whether
> such a proposal would be accepted at all, i.e., just wanted to hear
> whether you see some show stoppers or not.  Assuming this goes well I
> guess it should be documented under simple constraints
> https://gcc.gnu.org/onlinedocs/gcc/Simple-Constraints.html
> 
> Thanks,
> Stefan

That seems right.  I think it would be good for one of the global maintainers, 
or people of similar stature, to say whether Stefan's proposal is one that 
should be considered.  My 2c worth is that it should be.

The documentation might want to mention that the compiler will confirm that the 
register specified is suitable for the type given to it, just as it does for 
the old syntax.

paul



Re: [PATCH] Add a late-combine pass [PR106594]

2024-06-25 Thread YunQiang Su
Just FYI. This patch does something to gcc.target/mips/madd-8.c, and
gcc.target/mips/msub-8.c.

-PASS: gcc.target/mips/madd-8.c   -O2   scan-assembler \tmul\t
-PASS: gcc.target/mips/madd-8.c   -O2   scan-assembler-not \tmadd\t
-PASS: gcc.target/mips/madd-8.c   -O2   scan-assembler-not \tmflo\t
-PASS: gcc.target/mips/madd-8.c   -O2   scan-assembler-not \tmtlo\t
+FAIL: gcc.target/mips/madd-8.c   -O2   scan-assembler \tmul\t
+FAIL: gcc.target/mips/madd-8.c   -O2   scan-assembler-not \tmadd\t
+FAIL: gcc.target/mips/madd-8.c   -O2   scan-assembler-not \tmflo\t
+FAIL: gcc.target/mips/madd-8.c   -O2   scan-assembler-not \tmtlo\t

-FAIL: gcc.target/mips/msub-8.c   -O2   scan-assembler \tmul\t
-FAIL: gcc.target/mips/msub-8.c   -O2   scan-assembler-not \tmflo\t
-FAIL: gcc.target/mips/msub-8.c   -O2   scan-assembler-not \tmsub\t
-FAIL: gcc.target/mips/msub-8.c   -O2   scan-assembler-not \tmtlo\t
+PASS: gcc.target/mips/msub-8.c   -O2   scan-assembler \tmul\t
+PASS: gcc.target/mips/msub-8.c   -O2   scan-assembler-not \tmflo\t
+PASS: gcc.target/mips/msub-8.c   -O2   scan-assembler-not \tmsub\t
+PASS: gcc.target/mips/msub-8.c   -O2   scan-assembler-not \tmtlo\t

Quite interesting.  I will investigate the real reason.


Re: [PATCH] c++: ICE with __has_unique_object_representations [PR115476]

2024-06-25 Thread Jonathan Wakely
On Tue, 25 Jun 2024 at 16:17, Jason Merrill  wrote:
>
> On 6/25/24 07:15, Jonathan Wakely wrote:
> > On Tue, 25 Jun 2024 at 03:12, Jason Merrill  wrote:
> >>
> >> On 6/18/24 10:31, Marek Polacek wrote:
> >>> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/14/13?
> >>
> >> Makes sense to me, though probably the [meta.unary.prop] table should be
> >> adjusted in the same way.  Jonathan, what do you think?
> >
> > Just to make sure I understand correctly, the suggestion is to change
> > the precondition for the trait to something like:
> >
> > "remove_all_extents_t<T> shall be a complete type or cv void."
> >
> > i.e. if T is incomplete then T[] cannot be used with the trait, right?
>
> Yes.

Thanks. This is now https://cplusplus.github.io/LWG/issue4113
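
To make the proposed precondition concrete, here is a small sketch of what it permits and what it rules out. It assumes a C++17 standard library; the type name incomplete and the variable element_is_complete_ok are made up for the example.

```cpp
#include <type_traits>

// Complete element type: usable directly, as a bounded array, and as an
// array of unknown bound.
static_assert (std::has_unique_object_representations_v<unsigned char>);
static_assert (std::has_unique_object_representations_v<unsigned char[4]>);
static_assert (std::has_unique_object_representations_v<unsigned char[]>);

struct incomplete;   // declared, never defined

// Under the proposed wording,
//   std::has_unique_object_representations_v<incomplete[]>
// would violate the precondition, because
// remove_all_extents_t<incomplete[]> is an incomplete type.
// It is left commented out on purpose.

// The trait effectively looks through the array extents:
constexpr bool element_is_complete_ok
  = std::has_unique_object_representations_v<
      std::remove_all_extents_t<unsigned char[]>>;
```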



[committed] Fix fr30-elf newlib build failure with late-combine

2024-06-25 Thread Jeff Law

So the late combine work has exposed a latent bug in the fr30 port.

The fr30 "call" instruction is pc-relative with a *very* limited range, 
12 bits to be precise.



With such a limited range it's hard to see how we could ever consistently 
use it in the compiler, with the possible exception of self-recursion. 
Code generation seemed to be using indirect forms pretty consistently, 
though the RTL would allow direct calls.


With late-combine some of those indirects would be optimized into direct 
calls.  This naturally led to out of range scenarios.
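
A quick way to see how little room 12 bits leave is the range check below. It assumes, purely for illustration, that the field is a plain signed byte displacement; the real fr30 encoding may scale or bias the field, and fits_signed is a hypothetical helper, not something from the port.

```cpp
#include <cstdint>

// True if V is representable in a signed two's-complement field of BITS
// bits (1 <= BITS <= 63).
constexpr bool fits_signed (std::int64_t v, unsigned bits)
{
  const std::int64_t lo = -(std::int64_t{1} << (bits - 1));
  const std::int64_t hi = (std::int64_t{1} << (bits - 1)) - 1;
  return v >= lo && v <= hi;
}

// With 12 bits the reachable displacements are -2048 .. 2047.
static_assert (fits_signed (2047, 12));
static_assert (!fits_signed (2048, 12));
```

Anything the linker places further away than that cannot be reached by a direct call under this assumption, which is why forcing the address into a register is the safe choice.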


With the fr30 port slated for removal unless it gets updated to use LRA 
and the fundamental problems using direct calls, I took the shortest 
path to keep things working -- namely forcing all calls to be indirect.


Tested in my tester with no regressions (and fixes the newlib build 
failure with late-combine enabled).  Pushed to the trunk.


Jeff


commit 7c28228cda274484b78611ea4c5cbe4ce08f512e
Author: Jeff Law 
Date:   Tue Jun 25 11:22:01 2024 -0600

[committed] Fix fr30-elf newlib build failure with late-combine

So the late combine work has exposed a latent bug in the fr30 port.

The fr30 "call" instruction is pc-relative with a *very* limited range, 12
bits to be precise.

With such a limited range it's hard to see how we could ever consistently use
it in the compiler, with the possible exception of self-recursion.  Even for a
call to a locally binding function -ffunction-sections and linker placement of
functions may separate the caller/callee.  Code generation seemed to be using
indirect forms pretty consistently, though the RTL would allow direct calls.

With late-combine some of those indirects would be optimized into direct
calls.  This naturally led to out of range scenarios.

With the fr30 port slated for removal unless it gets updated to use LRA and
the fundamental problems using direct calls, I took the shortest path to keep
things working -- namely forcing all calls to be indirect.

Tested in my tester with no regressions (and fixes the newlib build failure
with late-combine enabled).  Pushed to the trunk.

gcc/
* config/fr30/constraints.md (Q): Remove unused constraint.
* config/fr30/predicates.md (call_operand): Remove unused predicate.
* config/fr30/fr30.md (call, call_value): Turn into expanders and
force the call address into a register.
(*call, *call_value): Adjust to only allow indirect calls.  Adjust
output template accordingly.

diff --git a/gcc/config/fr30/constraints.md b/gcc/config/fr30/constraints.md
index e4e2be1bfd9..1beee7cc3a2 100644
--- a/gcc/config/fr30/constraints.md
+++ b/gcc/config/fr30/constraints.md
@@ -63,9 +63,3 @@ (define_constraint "P"
   "An integer in the range -256 to 255."
   (and (match_code "const_int")
(match_test "IN_RANGE (ival, -256, 255)")))
-
-;; Extra constraints.
-(define_constraint "Q"
-  "@internal"
-  (and (match_code "mem")
-   (match_code "symbol_ref" "0")))
diff --git a/gcc/config/fr30/fr30.md b/gcc/config/fr30/fr30.md
index ecde60b455d..04f6d909054 100644
--- a/gcc/config/fr30/fr30.md
+++ b/gcc/config/fr30/fr30.md
@@ -1079,12 +1079,19 @@ (define_insn "*branch_false"
 ;; `SImode', except it is normally a `const_int'); operand 2 is the number of
 ;; registers used as operands.
 
-(define_insn "call"
-  [(call (match_operand 0 "call_operand" "Qm")
+(define_expand "call"
+  [(parallel [(call (mem:QI (match_operand:SI  0 "register_operand" "r"))
+(match_operand 1 "" "g"))
+  (clobber (reg:SI 17))])]
+  ""
+  " { operands[0] = force_reg (SImode, XEXP (operands[0], 0)); } ")
+
+(define_insn "*call"
+  [(call (mem:QI (match_operand:SI 0 "register_operand" "r"))
 (match_operand 1 "" "g"))
(clobber (reg:SI 17))]
   ""
-  "call%#\\t%0"
+  "call%#\\t@%0"
   [(set_attr "delay_type" "delayed")]
 )
 
@@ -1094,14 +1101,21 @@ (define_insn "call"
 ;; increased by one).
 
 ;; Subroutines that return `BLKmode' objects use the `call' insn.
+(define_expand "call_value"
+  [(parallel [(set (match_operand 0 "register_operand"  "=r")
+   (call (mem:QI (match_operand:SI 1 "register_operand" "r"))
+ (match_operand 2 "" "g")))
+  (clobber (reg:SI 17))])]
+  ""
+  " { operands[1] = force_reg (SImode, XEXP (operands[1], 0)); } ")
 
-(define_insn "call_value"
+(define_insn "*call_value"
   [(set (match_operand 0 "register_operand"  "=r")
-   (call (match_operand 1 "call_operand" "Qm")
+   (call (mem:QI (match_operand:SI 1 "register_operand" "r"))
  (match_operand 2 "" "g")))
(clobber (reg:SI 17))]
   ""
-  "call%#\\t%1"
+  "call%#\\t@%1"
   [(set_attr "delay_type" "delayed")]
 )
 
diff --git a/gcc/config/fr30/predicates.md b/gcc/config/fr30/predicates.md
index f67c6097430..

Re: [PATCH v3] [testsuite] [arm] [vect] adjust mve-vshr test [PR113281]

2024-06-25 Thread Richard Sandiford
Alexandre Oliva  writes:
> On Jun 24, 2024, "Richard Earnshaw (lists)"  wrote:
>
>> A signed shift right on a 16-bit vector element by 15 would still
>> yield -1
>
> Yeah.  Indeed, ISTM that we *could* have retained the clamping
> transformation for *signed* shifts, since the clamping would only make a
> difference in case of (undefined) overflow.  Only for unsigned shifts
> can well-defined shifts yield different results with clamping.
>
> Richard (Sandiford), do you happen to recall why the IRC conversation
> mentioned in the PR trail decided to drop it entirely, even for signed
> types?

In the PR, the original shift was 32768 >> x (x >= 16) on ints, which the
vectoriser was narrowing to 32768 >> x' on shorts.  The original shift is
well-defined for both signed and unsigned shifts, and no valid x' exists
for that case.
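
The mismatch can be reproduced with plain scalar code. The sketch below assumes a 32-bit int and an arithmetic right shift for negative values (guaranteed since C++20, implementation-defined before that); the function names are made up for the example.

```cpp
#include <cstdint>

// Original computation on int: well defined for 0 <= x < 32.
constexpr int shift_as_int (unsigned x) { return 32768 >> x; }

// Narrowed to unsigned 16-bit lanes with the shift amount clamped to 15.
constexpr int shift_as_u16_clamped (unsigned x)
{
  const unsigned amount = x > 15 ? 15 : x;
  return static_cast<std::uint16_t> (std::uint16_t{32768} >> amount);
}

// Narrowed to signed 16-bit lanes: the bit pattern 0x8000 reads as -32768.
constexpr int shift_as_s16_clamped (unsigned x)
{
  const unsigned amount = x > 15 ? 15 : x;
  return static_cast<std::int16_t> (std::int16_t{-32768} >> amount);
}
```

For x = 16 the int version gives 0, while the clamped 16-bit versions give 1 (unsigned) and -1 (signed), so neither narrowed form has a shift amount that reproduces the original result.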

Thanks,
Richard



Re: [PATCH] Add -finline-functions-aggressive option [PR114531]

2024-06-25 Thread Malladi, Rama
Thanks for the review and the inputs, Richard Biener. The `-finline-as=` option 
is an interesting idea.

However, this PR specifically aims to make these `-O3` inline params 
available under some `-f` option, similar to some of the existing inline 
options.

On 6/24/24, 6:28 AM, "Richard Biener" <richard.guent...@gmail.com> wrote:


On Mon, Jun 24, 2024 at 1:18 PM Malladi, Rama <rvmal...@amazon.com> wrote:
>
> From: Rama Malladi <rvmal...@amazon.com>


Hmm, if we offer the ability to set -O3 inline limits why wouldn't we
offer a way to set -O2 inline limits for example with -O3? So ... wouldn't
a -finline-limit={default,O2,O3} option be a more generic and
extensible way to achieve what the patch does?


Yeah, it conflicts somewhat with the existing -finline-limit[-=] flags,
so possibly another name (-finline-as=O3?) is needed.
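
To illustrate the "select a bundle of limits by level name" idea, here is a minimal sketch. It is not GCC code: inline_params and inline_preset are hypothetical names, and only the -O3 values that appear in the quoted opts.cc hunk are filled in, since the thread does not list the defaults used at other levels.

```cpp
#include <optional>
#include <string_view>

// Hypothetical bundle of the inline params touched by the patch.
struct inline_params
{
  int max_inline_insns_auto;
  int early_inlining_insns;
  int inline_heuristics_hint_percent;
  int inline_min_speedup;
  int max_inline_insns_single;
};

// Hypothetical lookup behind an option such as -finline-as=<level>.
constexpr std::optional<inline_params> inline_preset (std::string_view level)
{
  if (level == "O3")
    return inline_params{30, 14, 600, 15, 200};   // values from the -O3 table
  // Other levels are deliberately left out: their values are not shown here.
  return std::nullopt;
}
```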


Richard.


> Signed-off-by: Rama Malladi <rvmal...@amazon.com>
> ---
> gcc/common.opt | 5 +
> gcc/doc/invoke.texi | 18 +-
> gcc/opts.cc | 17 -
> 3 files changed, 30 insertions(+), 10 deletions(-)
>
> diff --git a/gcc/common.opt b/gcc/common.opt
> index f2bc47fdc5e..ce95175c1e4 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -1961,6 +1961,11 @@ finline-functions-called-once
> Common Var(flag_inline_functions_called_once) Optimization
> Integrate functions only required by their single caller.
>
> +finline-functions-aggressive
> +Common Var(flag_inline_functions_aggressive) Init(0) Optimization
> +Aggressively integrate functions not declared \"inline\" into their callers 
> when profitable.
> +This option selects the same inlining heuristics as \"-O3\".
> +
> finline-limit-
> Common RejectNegative Joined Alias(finline-limit=)
>
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index c790e2f3518..7dc5c5ab433 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -570,8 +570,8 @@ Objective-C and Objective-C++ Dialects}.
> -fgcse-sm -fhoist-adjacent-loads -fif-conversion
> -fif-conversion2 -findirect-inlining
> -finline-stringops[=@var{fn}]
> --finline-functions -finline-functions-called-once -finline-limit=@var{n}
> --finline-small-functions -fipa-modref -fipa-cp -fipa-cp-clone
> +-finline-functions -finline-functions-aggressive 
> -finline-functions-called-once
> +-finline-limit=@var{n} -finline-small-functions -fipa-modref -fipa-cp 
> -fipa-cp-clone
> -fipa-bit-cp -fipa-vrp -fipa-pta -fipa-profile -fipa-pure-const
> -fipa-reference -fipa-reference-addressable
> -fipa-stack-alignment -fipa-icf -fira-algorithm=@var{algorithm}
> @@ -12625,9 +12625,9 @@ designed to reduce code size.
> Disregard strict standards compliance. @option{-Ofast} enables all
> @option{-O3} optimizations. It also enables optimizations that are not
> valid for all standard-compliant programs.
> -It turns on @option{-ffast-math}, @option{-fallow-store-data-races}
> -and the Fortran-specific @option{-fstack-arrays}, unless
> -@option{-fmax-stack-var-size} is specified, and @option{-fno-protect-parens}.
> +It turns on @option{-ffast-math}, @option{-finline-functions-aggressive},
> +@option{-fallow-store-data-races} and the Fortran-specific 
> @option{-fstack-arrays},
> +unless @option{-fmax-stack-var-size} is specified, and 
> @option{-fno-protect-parens}.
> It turns off @option{-fsemantic-interposition}.
>
> @opindex Og
> @@ -12793,6 +12793,14 @@ assembler code in its own right.
> Enabled at levels @option{-O2}, @option{-O3}, @option{-Os}. Also enabled
> by @option{-fprofile-use} and @option{-fauto-profile}.
>
> +@opindex finline-functions-aggressive
> +@item -finline-functions-aggressive
> +Aggressively integrate functions not declared @code{inline} into their 
> callers when
> +profitable. This option selects the same inlining heuristics as @option{-O3}.
> +
> +Enabled at levels @option{-O3}, @option{-Ofast}, but not @option{-Og},
> +@option{-O1}, @option{-O2}, @option{-Os}.
> +
> @opindex finline-functions-called-once
> @item -finline-functions-called-once
> Consider all @code{static} functions called once for inlining into their
> diff --git a/gcc/opts.cc b/gcc/opts.cc
> index 1b1b46455af..729f2831e67 100644
> --- a/gcc/opts.cc
> +++ b/gcc/opts.cc
> @@ -700,11 +700,7 @@ static const struct default_options 
> default_options_table[] =
> { OPT_LEVELS_3_PLUS, OPT_fversion_loops_for_strides, NULL, 1 },
>
> /* -O3 parameters. */
> - { OPT_LEVELS_3_PLUS, OPT__param_max_inline_insns_auto_, NULL, 30 },
> - { OPT_LEVELS_3_PLUS, OPT__param_early_inlining_insns_, NULL, 14 },
> - { OPT_LEVELS_3_PLUS, OPT__param_inline_heuristics_hint_percent_, NULL, 600 
> },
> - { OPT_LEVELS_3_PLUS, OPT__param_inline_min_speedup_, NULL, 15 },
> - { OPT_LEVELS_3_PLUS, OPT__param_max_inline_insns_single_, NULL, 200 },
> + { OPT_LEVELS_3_PLUS, OPT_finline_functions_aggres

Re: [PATCH] c++: ICE with __has_unique_object_representations [PR115476]

2024-06-25 Thread Marek Polacek
On Tue, Jun 25, 2024 at 06:22:56PM +0100, Jonathan Wakely wrote:
> On Tue, 25 Jun 2024 at 16:17, Jason Merrill  wrote:
> >
> > On 6/25/24 07:15, Jonathan Wakely wrote:
> > > On Tue, 25 Jun 2024 at 03:12, Jason Merrill  wrote:
> > >>
> > >> On 6/18/24 10:31, Marek Polacek wrote:
> > >>> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/14/13?
> > >>
> > >> Makes sense to me, though probably the [meta.unary.prop] table should be
> > >> adjusted in the same way.  Jonathan, what do you think?
> > >
> > > Just to make sure I understand correctly, the suggestion is to change
> > > the precondition for the trait to something like:
> > >
> > > "remove_all_extents_t shall be a complete type or cv void."
> > >
> > > i.e. if T is incomplete then T[] cannot be used with the trait, right?
> >
> > Yes.
> 
> Thanks. This is now https://cplusplus.github.io/LWG/issue4113

Looks good.  Should I push my patch now? 

Marek



[COMMITTED] Add param for bb limit to invoke fast_vrp.

2024-06-25 Thread Andrew MacLeod


On 6/25/24 00:27, Andrew Pinski wrote:

On Mon, Jun 24, 2024 at 7:35 PM Andrew Pinski  wrote:



This should be:
warning (OPT_Wdisabled_optimization, "Using fast VRP algorithm. %d basic blocks"
 " exceeds %<--param=vrp-block-limit=%d%> limit",
n_basic_blocks_for_fn (fun), param_vrp_block_limit);

I had thought it was mentioned that options should be quoted but it is
not mentioned in the coding conventions:
https://gcc.gnu.org/codingconventions.html#Diagnostics

But it is mentioned in
https://inbox.sourceware.org/gcc/2d2bd844-2de4-ecff-7a07-b22350750...@gmail.com/
This is why you were getting an error, as you mentioned on IRC.

Note I filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115627 to
add it to the coding conventions. I might take a stab at adding the
rules mentioned in
https://inbox.sourceware.org/gcc-patches/8ac62fe2-e4bf-0922-4947-fca9567a0...@gmail.com/
since those are the ones which are detected right now.

Thanks,
Andrew


Thanks.  Final version checked in:

Andrew
From 1ea95cc5e099d554764b82df8e972129e9d20885 Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Mon, 17 Jun 2024 11:38:46 -0400
Subject: [PATCH] Add param for bb limit to invoke fast_vrp.

If the basic block count is too high, simply use fast_vrp for all
VRP passes.

	* doc/invoke.texi (vrp-block-limit): Document.
	* params.opt (param=vrp-block-limit): New.
	* tree-vrp.cc (fvrp_folder::execute): Invoke fast_vrp if block
	count exceeds limit.
---
 gcc/doc/invoke.texi |  3 +++
 gcc/params.opt  |  4 
 gcc/tree-vrp.cc | 14 --
 3 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 23d90db2925..729dbc1691e 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -16849,6 +16849,9 @@ this parameter.  The default value of this parameter is 50.
 @item vect-induction-float
 Enable loop vectorization of floating point inductions.
 
+@item vrp-block-limit
+Maximum number of basic blocks before VRP switches to a lower memory algorithm.
+
 @item vrp-sparse-threshold
 Maximum number of basic blocks before VRP uses a sparse bitmap cache.
 
diff --git a/gcc/params.opt b/gcc/params.opt
index d34ef545bf0..c17ba17b91b 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -1198,6 +1198,10 @@ The maximum factor which the loop vectorizer applies to the cost of statements i
 Common Joined UInteger Var(param_vect_induction_float) Init(1) IntegerRange(0, 1) Param Optimization
 Enable loop vectorization of floating point inductions.
 
+-param=vrp-block-limit=
+Common Joined UInteger Var(param_vrp_block_limit) Init(15) Optimization Param
+Maximum number of basic blocks before VRP switches to a fast model with less memory requirements.
+
 -param=vrp-sparse-threshold=
 Common Joined UInteger Var(param_vrp_sparse_threshold) Init(3000) Optimization Param
 Maximum number of basic blocks before VRP uses a sparse bitmap cache.
diff --git a/gcc/tree-vrp.cc b/gcc/tree-vrp.cc
index 26979b706e5..e184e9af51e 100644
--- a/gcc/tree-vrp.cc
+++ b/gcc/tree-vrp.cc
@@ -60,6 +60,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "ipa-cp.h"
 #include "ipa-prop.h"
 #include "attribs.h"
+#include "diagnostic-core.h"
 
 // This class is utilized by VRP and ranger to remove __builtin_unreachable
 // calls, and reflect any resulting global ranges.
@@ -1331,9 +1332,18 @@ public:
   unsigned int execute (function *fun) final override
 {
   // Check for fast vrp.
-  if (&data == &pass_data_fast_vrp)
+  bool use_fvrp = (&data == &pass_data_fast_vrp);
+  if (!use_fvrp && last_basic_block_for_fn (fun) > param_vrp_block_limit)
+	{
+	  use_fvrp = true;
+	  warning (OPT_Wdisabled_optimization,
+		   "Using fast VRP algorithm. %d basic blocks"
+		   " exceeds %<--param=vrp-block-limit=%d%> limit",
+		   n_basic_blocks_for_fn (fun),
+		   param_vrp_block_limit);
+	}
+  if (use_fvrp)
 	return execute_fast_vrp (fun, final_p);
-
   return execute_ranger_vrp (fun, final_p);
 }
 
-- 
2.45.0
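
Stripped of the GCC types, the selection logic in the hunk above amounts to the following sketch. The enum and function names are hypothetical, and plain integers stand in for the pass data and the param value.

```cpp
// Which VRP implementation a pass instance ends up running.
enum class vrp_algo { ranger, fast };

// IS_FAST_PASS models "&data == &pass_data_fast_vrp"; BLOCK_COUNT and LIMIT
// model the function's block count and --param=vrp-block-limit.
constexpr vrp_algo choose_vrp (bool is_fast_pass, int block_count, int limit)
{
  // Any VRP pass falls back to the fast algorithm once the function is
  // larger than the limit (the real code also emits a warning here).
  if (is_fast_pass || block_count > limit)
    return vrp_algo::fast;
  return vrp_algo::ranger;
}
```

Note that the comparison is strict, so a function with exactly as many blocks as the limit still gets the ranger-based pass.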



[PATCH 0/3] RISC-V: AMO testsuite cleanup

2024-06-25 Thread Patrick O'Neill
This is another round of AMO testcase cleanup. Consolidates a lot of testcases
and unifies the testcase names.

Patrick O'Neill (3):
  RISC-V: Rename amo testcases
  RISC-V: Consolidate amo testcase variants
  RISC-V: Update testcase comments to point to PSABI rather than Table
A.6

 .../gcc.target/riscv/amo/a-rvwmo-fence.c  | 56 +
 ...le-a-6-load-2.c => a-rvwmo-load-acquire.c} |  2 +-
 ...le-a-6-load-1.c => a-rvwmo-load-relaxed.c} |  2 +-
 ...le-a-6-load-3.c => a-rvwmo-load-seq-cst.c} |  2 +-
 ...pat-3.c => a-rvwmo-store-compat-seq-cst.c} |  3 +-
 ...-a-6-store-1.c => a-rvwmo-store-relaxed.c} |  2 +-
 ...-a-6-store-2.c => a-rvwmo-store-release.c} |  2 +-
 .../gcc.target/riscv/amo/a-ztso-fence.c   | 52 +
 ...le-ztso-load-2.c => a-ztso-load-acquire.c} |  2 +-
 ...le-ztso-load-1.c => a-ztso-load-relaxed.c} |  2 +-
 ...le-ztso-load-3.c => a-ztso-load-seq-cst.c} |  2 +-
 ...tore-3.c => a-ztso-store-compat-seq-cst.c} |  3 +-
 ...-ztso-store-1.c => a-ztso-store-relaxed.c} |  2 +-
 ...-ztso-store-2.c => a-ztso-store-release.c} |  2 +-
 .../riscv/amo/amo-table-a-6-amo-add-1.c   | 17 
 .../riscv/amo/amo-table-a-6-amo-add-2.c   | 17 
 .../riscv/amo/amo-table-a-6-amo-add-3.c   | 17 
 .../riscv/amo/amo-table-a-6-amo-add-4.c   | 17 
 .../riscv/amo/amo-table-a-6-amo-add-5.c   | 17 
 .../riscv/amo/amo-table-a-6-fence-1.c | 15 
 .../riscv/amo/amo-table-a-6-fence-2.c | 16 
 .../riscv/amo/amo-table-a-6-fence-3.c | 16 
 .../riscv/amo/amo-table-a-6-fence-4.c | 16 
 .../riscv/amo/amo-table-a-6-fence-5.c | 16 
 .../riscv/amo/amo-table-ztso-amo-add-1.c  | 17 
 .../riscv/amo/amo-table-ztso-amo-add-2.c  | 17 
 .../riscv/amo/amo-table-ztso-amo-add-3.c  | 17 
 .../riscv/amo/amo-table-ztso-amo-add-4.c  | 17 
 .../riscv/amo/amo-table-ztso-amo-add-5.c  | 17 
 .../riscv/amo/amo-table-ztso-fence-1.c| 15 
 .../riscv/amo/amo-table-ztso-fence-2.c| 15 
 .../riscv/amo/amo-table-ztso-fence-3.c| 15 
 .../riscv/amo/amo-table-ztso-fence-4.c| 15 
 .../riscv/amo/amo-table-ztso-fence-5.c| 16 
 .../riscv/amo/amo-zalrsc-amo-add-1.c  | 22 --
 .../riscv/amo/amo-zalrsc-amo-add-2.c  | 22 --
 .../riscv/amo/amo-zalrsc-amo-add-3.c  | 22 --
 .../riscv/amo/amo-zalrsc-amo-add-4.c  | 22 --
 .../riscv/amo/amo-zalrsc-amo-add-5.c  | 22 --
 ...zalrsc.c => zaamo-preferred-over-zalrsc.c} |  0
 .../riscv/amo/zaamo-rvwmo-amo-add-int.c   | 57 ++
 .../riscv/amo/zaamo-ztso-amo-add-int.c| 57 ++
 .../riscv/amo/zalrsc-rvwmo-amo-add-int.c  | 78 +++
 ...mo-compare-exchange-int-acquire-release.c} |  2 +-
 ...lrsc-rvwmo-compare-exchange-int-acquire.c} |  2 +-
 ...lrsc-rvwmo-compare-exchange-int-consume.c} |  2 +-
 ...lrsc-rvwmo-compare-exchange-int-relaxed.c} |  2 +-
 ...lrsc-rvwmo-compare-exchange-int-release.c} |  2 +-
 ...mo-compare-exchange-int-seq-cst-relaxed.c} |  3 +-
 ...lrsc-rvwmo-compare-exchange-int-seq-cst.c} |  2 +-
 ...lrsc-rvwmo-subword-amo-add-char-acq-rel.c} |  2 +-
 ...lrsc-rvwmo-subword-amo-add-char-acquire.c} |  2 +-
 ...lrsc-rvwmo-subword-amo-add-char-relaxed.c} |  2 +-
 ...lrsc-rvwmo-subword-amo-add-char-release.c} |  2 +-
 ...lrsc-rvwmo-subword-amo-add-char-seq-cst.c} |  2 +-
 .../riscv/amo/zalrsc-ztso-amo-add-int.c   | 78 +++
 ...so-compare-exchange-int-acquire-release.c} |  3 +-
 ...alrsc-ztso-compare-exchange-int-acquire.c} |  2 +-
 ...alrsc-ztso-compare-exchange-int-consume.c} |  2 +-
 ...alrsc-ztso-compare-exchange-int-relaxed.c} |  2 +-
 ...alrsc-ztso-compare-exchange-int-release.c} |  2 +-
 ...so-compare-exchange-int-seq-cst-relaxed.c} |  3 +-
 ...alrsc-ztso-compare-exchange-int-seq-cst.c} |  2 +-
 ...alrsc-ztso-subword-amo-add-char-acq-rel.c} |  2 +-
 ...alrsc-ztso-subword-amo-add-char-acquire.c} |  2 +-
 ...alrsc-ztso-subword-amo-add-char-relaxed.c} |  2 +-
 ...alrsc-ztso-subword-amo-add-char-release.c} |  2 +-
 ...alrsc-ztso-subword-amo-add-char-seq-cst.c} |  2 +-
 68 files changed, 419 insertions(+), 471 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo/a-rvwmo-fence.c
 rename gcc/testsuite/gcc.target/riscv/amo/{amo-table-a-6-load-2.c => 
a-rvwmo-load-acquire.c} (94%)
 rename gcc/testsuite/gcc.target/riscv/amo/{amo-table-a-6-load-1.c => 
a-rvwmo-load-relaxed.c} (94%)
 rename gcc/testsuite/gcc.target/riscv/amo/{amo-table-a-6-load-3.c => 
a-rvwmo-load-seq-cst.c} (94%)
 rename gcc/testsuite/gcc.target/riscv/amo/{amo-table-a-6-store-compat-3.c => 
a-rvwmo-store-compat-seq-cst.c} (93%)
 rename gcc/testsuite/gcc.target/riscv/amo/{amo-table-a-6-store-1.c => 
a-rvwmo-store-relaxed.c} (94%)
 rename gcc/testsuite/gcc.target/riscv/amo/{amo-table-a-6-store-2.c => 
a-rvwmo-store-release.c} (94%)
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo/a-ztso-fence.c
 rename gcc/

[PATCH 1/3] RISC-V: Rename amo testcases

2024-06-25 Thread Patrick O'Neill
Rename riscv/amo/ testcases to follow a '{ext}-{model}-{name}-{memory order}.c'
naming convention.
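
As a small sketch of how the convention composes a filename, assuming the four fields are simply joined with hyphens (amo_test_name is a hypothetical helper, and the way the renamed files are split into fields below is my reading of the list that follows):

```cpp
#include <string>

// Build '{ext}-{model}-{name}-{memory order}.c' from its four fields.
std::string amo_test_name (const std::string &ext, const std::string &model,
                           const std::string &name, const std::string &order)
{
  return ext + "-" + model + "-" + name + "-" + order + ".c";
}
```

For example, the fields "zalrsc", "rvwmo", "compare-exchange-int" and "acquire" produce zalrsc-rvwmo-compare-exchange-int-acquire.c, which matches one of the renamed files.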

gcc/testsuite/ChangeLog:

* gcc.target/riscv/amo/amo-table-a-6-load-2.c: Move to...
* gcc.target/riscv/amo/a-rvwmo-load-acquire.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-load-1.c: Move to...
* gcc.target/riscv/amo/a-rvwmo-load-relaxed.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-load-3.c: Move to...
* gcc.target/riscv/amo/a-rvwmo-load-seq-cst.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-store-compat-3.c: Move to...
* gcc.target/riscv/amo/a-rvwmo-store-compat-seq-cst.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-store-1.c: Move to...
* gcc.target/riscv/amo/a-rvwmo-store-relaxed.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-store-2.c: Move to...
* gcc.target/riscv/amo/a-rvwmo-store-release.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-load-2.c: Move to...
* gcc.target/riscv/amo/a-ztso-load-acquire.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-load-1.c: Move to...
* gcc.target/riscv/amo/a-ztso-load-relaxed.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-load-3.c: Move to...
* gcc.target/riscv/amo/a-ztso-load-seq-cst.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-store-3.c: Move to...
* gcc.target/riscv/amo/a-ztso-store-compat-seq-cst.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-store-1.c: Move to...
* gcc.target/riscv/amo/a-ztso-store-relaxed.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-store-2.c: Move to...
* gcc.target/riscv/amo/a-ztso-store-release.c: ...here.
* gcc.target/riscv/amo/amo-zaamo-preferred-over-zalrsc.c: Move to...
* gcc.target/riscv/amo/zaamo-preferred-over-zalrsc.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-6.c: Move to...
* 
gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-acquire-release.c: 
...here.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-3.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-acquire.c: 
...here.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-2.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-consume.c: 
...here.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-1.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-relaxed.c: 
...here.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-4.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-release.c: 
...here.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-7.c: Move to...
* 
gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-seq-cst-relaxed.c: 
...here.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-5.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-seq-cst.c: 
...here.
* gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-4.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-acq-rel.c: 
...here.
* gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-2.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-acquire.c: 
...here.
* gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-1.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-relaxed.c: 
...here.
* gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-3.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-release.c: 
...here.
* gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-5.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-seq-cst.c: 
...here.
* gcc.target/riscv/amo/amo-table-ztso-compare-exchange-6.c: Move to...
* 
gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-acquire-release.c: 
...here.
* gcc.target/riscv/amo/amo-table-ztso-compare-exchange-3.c: Move to...
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-acquire.c: 
...here.
* gcc.target/riscv/amo/amo-table-ztso-compare-exchange-2.c: Move to...
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-consume.c: 
...here.
* gcc.target/riscv/amo/amo-table-ztso-compare-exchange-1.c: Move to...
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-relaxed.c: 
...here.
* gcc.target/riscv/amo/amo-table-ztso-compare-exchange-4.c: Move to...
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-release.c: 
...here.
* gcc.target/riscv/amo/amo-table-ztso-compare-exchange-7.c: Move to...
* 
gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-seq-cst-relaxed.c: 
...here.
* gcc.target/riscv/amo/amo-table-ztso-compare-exchange-5.c: Move to...
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-seq-cs

[PATCH 2/3] RISC-V: Consolidate amo testcase variants

2024-06-25 Thread Patrick O'Neill
Many riscv/amo/ testcases use check-function-bodies. These testcases can be
consolidated with related testcases (memory ordering variants) without affecting
the assertions.

Give functions descriptive names so testsuite failures are obvious from the
'FAIL:' line.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/amo/amo-table-a-6-amo-add-1.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-amo-add-2.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-amo-add-3.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-amo-add-4.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-amo-add-5.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-fence-1.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-fence-2.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-fence-3.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-fence-4.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-fence-5.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-amo-add-1.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-amo-add-2.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-amo-add-3.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-amo-add-4.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-amo-add-5.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-fence-1.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-fence-2.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-fence-3.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-fence-4.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-fence-5.c: Removed.
* gcc.target/riscv/amo/amo-zalrsc-amo-add-1.c: Removed.
* gcc.target/riscv/amo/amo-zalrsc-amo-add-2.c: Removed.
* gcc.target/riscv/amo/amo-zalrsc-amo-add-3.c: Removed.
* gcc.target/riscv/amo/amo-zalrsc-amo-add-4.c: Removed.
* gcc.target/riscv/amo/amo-zalrsc-amo-add-5.c: Removed.
* gcc.target/riscv/amo/a-rvwmo-fence.c: New test.
* gcc.target/riscv/amo/a-ztso-fence.c: New test.
* gcc.target/riscv/amo/zaamo-rvwmo-amo-add-int.c: New test.
* gcc.target/riscv/amo/zaamo-ztso-amo-add-int.c: New test.
* gcc.target/riscv/amo/zalrsc-rvwmo-amo-add-int.c: New test.
* gcc.target/riscv/amo/zalrsc-ztso-amo-add-int.c: New test.

Signed-off-by: Patrick O'Neill 
---
 .../gcc.target/riscv/amo/a-rvwmo-fence.c  | 56 +
 .../gcc.target/riscv/amo/a-ztso-fence.c   | 52 +
 .../riscv/amo/amo-table-a-6-amo-add-1.c   | 17 
 .../riscv/amo/amo-table-a-6-amo-add-2.c   | 17 
 .../riscv/amo/amo-table-a-6-amo-add-3.c   | 17 
 .../riscv/amo/amo-table-a-6-amo-add-4.c   | 17 
 .../riscv/amo/amo-table-a-6-amo-add-5.c   | 17 
 .../riscv/amo/amo-table-a-6-fence-1.c | 15 
 .../riscv/amo/amo-table-a-6-fence-2.c | 16 
 .../riscv/amo/amo-table-a-6-fence-3.c | 16 
 .../riscv/amo/amo-table-a-6-fence-4.c | 16 
 .../riscv/amo/amo-table-a-6-fence-5.c | 16 
 .../riscv/amo/amo-table-ztso-amo-add-1.c  | 17 
 .../riscv/amo/amo-table-ztso-amo-add-2.c  | 17 
 .../riscv/amo/amo-table-ztso-amo-add-3.c  | 17 
 .../riscv/amo/amo-table-ztso-amo-add-4.c  | 17 
 .../riscv/amo/amo-table-ztso-amo-add-5.c  | 17 
 .../riscv/amo/amo-table-ztso-fence-1.c| 15 
 .../riscv/amo/amo-table-ztso-fence-2.c| 15 
 .../riscv/amo/amo-table-ztso-fence-3.c| 15 
 .../riscv/amo/amo-table-ztso-fence-4.c| 15 
 .../riscv/amo/amo-table-ztso-fence-5.c| 16 
 .../riscv/amo/amo-zalrsc-amo-add-1.c  | 22 --
 .../riscv/amo/amo-zalrsc-amo-add-2.c  | 22 --
 .../riscv/amo/amo-zalrsc-amo-add-3.c  | 22 --
 .../riscv/amo/amo-zalrsc-amo-add-4.c  | 22 --
 .../riscv/amo/amo-zalrsc-amo-add-5.c  | 22 --
 .../riscv/amo/zaamo-rvwmo-amo-add-int.c   | 57 ++
 .../riscv/amo/zaamo-ztso-amo-add-int.c| 57 ++
 .../riscv/amo/zalrsc-rvwmo-amo-add-int.c  | 78 +++
 .../riscv/amo/zalrsc-ztso-amo-add-int.c   | 78 +++
 31 files changed, 378 insertions(+), 435 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo/a-rvwmo-fence.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo/a-ztso-fence.c
 delete mode 100644 gcc/testsuite/gcc.target/riscv/amo/amo-table-a-6-amo-add-1.c
 delete mode 100644 gcc/testsuite/gcc.target/riscv/amo/amo-table-a-6-amo-add-2.c
 delete mode 100644 gcc/testsuite/gcc.target/riscv/amo/amo-table-a-6-amo-add-3.c
 delete mode 100644 gcc/testsuite/gcc.target/riscv/amo/amo-table-a-6-amo-add-4.c
 delete mode 100644 gcc/testsuite/gcc.target/riscv/amo/amo-table-a-6-amo-add-5.c
 delete mode 100644 gcc/testsuite/gcc.target/riscv/amo/amo-table-a-6-fence-1.c
 delete mode 100644 gcc/testsuite/gcc.target/riscv/amo/amo-table-a-6-fence-2.c
 delete mode 100644 gcc/testsu

[PATCH 3/3] RISC-V: Update testcase comments to point to PSABI rather than Table A.6

2024-06-25 Thread Patrick O'Neill
Table A.6 was originally the source of truth for the recommended mappings.
Point to the PSABI doc since the memory model mappings have been moved there.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/amo/a-rvwmo-fence.c: Replace A.6 reference with 
PSABI.
* gcc.target/riscv/amo/a-rvwmo-load-acquire.c: Ditto.
* gcc.target/riscv/amo/a-rvwmo-load-relaxed.c: Ditto.
* gcc.target/riscv/amo/a-rvwmo-load-seq-cst.c: Ditto.
* gcc.target/riscv/amo/a-rvwmo-store-compat-seq-cst.c: Ditto.
* gcc.target/riscv/amo/a-rvwmo-store-relaxed.c: Ditto.
* gcc.target/riscv/amo/a-rvwmo-store-release.c: Ditto.
* gcc.target/riscv/amo/a-ztso-fence.c: Ditto.
* gcc.target/riscv/amo/a-ztso-load-acquire.c: Ditto.
* gcc.target/riscv/amo/a-ztso-load-relaxed.c: Ditto.
* gcc.target/riscv/amo/a-ztso-load-seq-cst.c: Ditto.
* gcc.target/riscv/amo/a-ztso-store-compat-seq-cst.c: Ditto.
* gcc.target/riscv/amo/a-ztso-store-relaxed.c: Ditto.
* gcc.target/riscv/amo/a-ztso-store-release.c: Ditto.
* gcc.target/riscv/amo/zaamo-rvwmo-amo-add-int.c: Ditto.
* gcc.target/riscv/amo/zaamo-ztso-amo-add-int.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-acquire-release.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-acquire.c: 
Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-consume.c: 
Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-relaxed.c: 
Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-release.c: 
Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-seq-cst-relaxed.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-seq-cst.c: 
Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-acq-rel.c: 
Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-acquire.c: 
Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-relaxed.c: 
Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-release.c: 
Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-seq-cst.c: 
Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-acquire-release.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-acquire.c: 
Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-consume.c: 
Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-relaxed.c: 
Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-release.c: 
Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-seq-cst-relaxed.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-seq-cst.c: 
Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-acq-rel.c: 
Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-acquire.c: 
Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-relaxed.c: 
Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-release.c: 
Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-seq-cst.c: 
Ditto.

Signed-off-by: Patrick O'Neill 
---
 gcc/testsuite/gcc.target/riscv/amo/a-rvwmo-fence.c | 2 +-
 gcc/testsuite/gcc.target/riscv/amo/a-rvwmo-load-acquire.c  | 2 +-
 gcc/testsuite/gcc.target/riscv/amo/a-rvwmo-load-relaxed.c  | 2 +-
 gcc/testsuite/gcc.target/riscv/amo/a-rvwmo-load-seq-cst.c  | 2 +-
 .../gcc.target/riscv/amo/a-rvwmo-store-compat-seq-cst.c| 3 ++-
 gcc/testsuite/gcc.target/riscv/amo/a-rvwmo-store-relaxed.c | 2 +-
 gcc/testsuite/gcc.target/riscv/amo/a-rvwmo-store-release.c | 2 +-
 gcc/testsuite/gcc.target/riscv/amo/a-ztso-fence.c  | 2 +-
 gcc/testsuite/gcc.target/riscv/amo/a-ztso-load-acquire.c   | 2 +-
 gcc/testsuite/gcc.target/riscv/amo/a-ztso-load-relaxed.c   | 2 +-
 gcc/testsuite/gcc.target/riscv/amo/a-ztso-load-seq-cst.c   | 2 +-
 .../gcc.target/riscv/amo/a-ztso-store-compat-seq-cst.c | 3 ++-
 gcc/testsuite/gcc.target/riscv/amo/a-ztso-store-relaxed.c  | 2 +-
 gcc/testsuite/gcc.target/riscv/amo/a-ztso-store-release.c  | 2 +-
 gcc/testsuite/gcc.target/riscv/amo/zaamo-rvwmo-amo-add-int.c   | 2 +-
 gcc/testsuite/gcc.target/riscv/amo/zaamo-ztso-amo-add-int.c| 2 +-
 .../amo/zalrsc-rvwmo-compare-exchange-int-acquire-release.c| 2 +-
 .../riscv/amo/zalrsc-rvwmo-compare-exchange-int-acquire.c  | 2 +-
 .../riscv/amo/zalrsc-rvwmo-compare-exchange-int-consume.c  | 2 +-
 .../riscv/amo/zalrsc-rvwmo-compare-exchange-int-relaxed.c  | 2 +-
 .../riscv/amo/zalrsc-rvwmo-compare-exchange-int-release.c  | 2 +-
 .../amo/zalrsc-rvwmo-compare-exchange-int-seq-cst-relaxed.c| 3 ++-
 .../riscv/amo/zalrsc-rvwmo-compare-exchange-int-seq-cst.c  | 2 +-
 .../riscv/amo/zalrsc-rvwmo-subword-amo-add-char-acq-rel.c  | 2 +-
 .../riscv/amo/

Re: [PATCH] c++: ICE with __has_unique_object_representations [PR115476]

2024-06-25 Thread Jason Merrill

On 6/25/24 15:07, Marek Polacek wrote:

On Tue, Jun 25, 2024 at 06:22:56PM +0100, Jonathan Wakely wrote:

On Tue, 25 Jun 2024 at 16:17, Jason Merrill  wrote:


On 6/25/24 07:15, Jonathan Wakely wrote:

On Tue, 25 Jun 2024 at 03:12, Jason Merrill  wrote:


On 6/18/24 10:31, Marek Polacek wrote:

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/14/13?


Makes sense to me, though probably the [meta.unary.prop] table should be
adjusted in the same way.  Jonathan, what do you think?


Just to make sure I understand correctly, the suggestion is to change
the precondition for the trait to something like:

"remove_all_extents_t shall be a complete type or cv void."

i.e. if T is incomplete then T[] cannot be used with the trait, right?


Yes.


Thanks. This is now https://cplusplus.github.io/LWG/issue4113


Looks good.  Should I push my patch now?


Yes, please.

Jason
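
As a rough illustration of the precondition discussed in this thread, the following C++17 sketch applies `std::has_unique_object_representations_v` to complete types only. The `Packed` and `Padded` structs are made up for the example, and the expected results assume a typical ABI where `int` is 4 bytes with 4-byte alignment. It does not reproduce the ICE from PR115476.

```cpp
#include <type_traits>

// Hypothetical example types, not taken from the PR.
struct Packed { int a; int b; };   // no padding expected on common ABIs
struct Padded { char c; int i; };  // padding expected between c and i

constexpr bool int_unique    = std::has_unique_object_representations_v<int>;
constexpr bool array_unique  = std::has_unique_object_representations_v<int[4]>;
constexpr bool packed_unique = std::has_unique_object_representations_v<Packed>;
constexpr bool padded_unique = std::has_unique_object_representations_v<Padded>;

// An array whose element type is incomplete (e.g. 'struct S;' then 'S[]')
// would violate the precondition proposed above, so it is deliberately
// not instantiated here.
```

On such an ABI the first three constants should be true and `padded_unique` false, since padding bytes do not contribute to the value.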



Re: [PATCH][c++ frontend]: check for missing condition for novector [PR115623]

2024-06-25 Thread Jason Merrill

On 6/25/24 12:52, Tamar Christina wrote:

The 06/25/2024 17:10, Jason Merrill wrote:

On 6/25/24 04:01, Tamar Christina wrote:

Hi All,

It looks like I forgot to check in the C++ frontend if a condition exists for the
loop being adorned with novector.  This causes a segfault because cond isn't
expected to be null.

This fixes it by issuing the same kind of diagnostics we issue for the other
pragmas.

Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

Ok for master? and backport to GCC-14?


Hmm, I'm not sure we want to error in this case; it's pointless, but
indeed we aren't going to vectorize a loop that always loops.  I'd think
we should treat it the same as an explicit "true" condition.  And
perhaps the same for unroll/ivdep.

Does the C front-end treat the null condition differently from a constant
true condition?



No, in the C front-end we error for ivdep and unroll, but for novector we
explicitly suppress it by checking for
novector && cond && cond != error_mark_node instead of just
novector && cond != error_mark_node in the use site.

Do you want to handle it that way to be consistent?


Please.

Jason
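
The difference between the two checks mentioned above can be modelled outside the compiler. This is only a sketch: `Tree`, `error_mark`, `some_cond`, and `annotate_novector` are hypothetical stand-ins, not GCC's real trees or front-end code. It shows how the extra `cond &&` test makes a missing loop condition fall through silently instead of being dereferenced.

```cpp
// Simplified stand-ins for front-end trees; NOT GCC's real types.
struct Tree {};
constexpr Tree error_mark{};                          // models error_mark_node
constexpr const Tree* error_mark_node = &error_mark;
constexpr Tree some_cond{};                           // models a parsed condition

// Annotate the loop only when a condition exists and parsed without error.
// A null 'cond' models 'for (;;)', where there is no condition at all.
constexpr bool annotate_novector(bool novector, const Tree* cond) {
  return novector && cond && cond != error_mark_node;
}
```

With this shape, a loop without a condition is simply left unannotated, which matches treating it like a loop that is never going to be vectorized anyway.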



Re: [PATCH] c++: decltype of by-ref capture proxy of ref [PR115504]

2024-06-25 Thread Jason Merrill

On 6/25/24 11:45, Patrick Palka wrote:

On Tue, 25 Jun 2024, Jason Merrill wrote:


On 6/25/24 11:03, Patrick Palka wrote:

On Mon, 24 Jun 2024, Jason Merrill wrote:


On 6/24/24 21:00, Patrick Palka wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK
for trunk/14?

-- >8 --

The capture proxy handling in finish_decltype_type added in r14-5330
was stripping the reference type of a capture proxy's captured variable,
which is desirable for a by-value capture, but not for a by-ref capture
(of a reference).


I'm not sure why we would want it for by-value, either; regardless of the
capture kind, decltype(x) is int&.


Ah, makes sense.  But I guess that means

void f(int& x) {
  [x]() {
    decltype(auto) a = x;
  };
}

is ill-formed since decltype(x) is int& but the corresponding closure
member is const?  It works if we make the lambda mutable.


Yes, and clang agrees.  Let's also test that case.


Like so?


OK.


-- >8 --

Subject: [PATCH] c++: decltype of capture proxy of ref [PR115504]

The capture proxy handling in finish_decltype_type added in r14-5330
was stripping the reference type of a capture proxy's captured variable.

PR c++/115504

gcc/cp/ChangeLog:

* semantics.cc (finish_decltype_type): Don't strip the reference
type (if any) of a capture proxy's captured variable.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/decltype-auto8.C: New test.
---
  gcc/cp/semantics.cc |  1 -
  gcc/testsuite/g++.dg/cpp1y/decltype-auto8.C | 22 +
  2 files changed, 22 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp1y/decltype-auto8.C

diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 6c1813d37c6..6a383c0f7f9 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12071,7 +12071,6 @@ finish_decltype_type (tree expr, bool id_expression_or_member_access_p,
{
  expr = DECL_CAPTURED_VARIABLE (expr);
  type = TREE_TYPE (expr);
- type = non_reference (type);
}
  else
{
diff --git a/gcc/testsuite/g++.dg/cpp1y/decltype-auto8.C b/gcc/testsuite/g++.dg/cpp1y/decltype-auto8.C
new file mode 100644
index 000..55135cecf72
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/decltype-auto8.C
@@ -0,0 +1,22 @@
+// PR c++/115504
+// { dg-do compile { target c++14 } }
+
+void f(int& x, const int& y) {
+  [&x]() {
+    decltype(auto) a = x;
+    using type = decltype(x);
+    using type = decltype(a);
+    using type = int&; // not 'int'
+  };
+
+  [x]() {
+    decltype(auto) a = x; // { dg-error "discards qualifiers" }
+  };
+
+  [x]() mutable {
+    decltype(auto) a = x;
+    using type = decltype(x);
+    using type = decltype(a);
+    using type = int&;
+  };
+}




Re: [PATCH v2 3/3] RISC-V: cmpmem for RISCV with V extension

2024-06-25 Thread Jeff Law




On 12/19/23 2:53 AM, Sergei Lewis wrote:

gcc/ChangeLog:

 * config/riscv/riscv-protos.h (riscv_vector::expand_vec_cmpmem): New
 function declaration.

 * config/riscv/riscv-string.cc (riscv_vector::expand_vec_cmpmem): New
 function; this generates an inline vectorised memory compare, if and only
 if we know the entire operation can be performed in a single vector load
 per input

 * config/riscv/riscv.md (cmpmemsi): Try riscv_vector::expand_vec_cmpmem for
 constant lengths

gcc/testsuite/ChangeLog:

 * gcc.target/riscv/rvv/base/cmpmem-1.c: New codegen tests
 * gcc.target/riscv/rvv/base/cmpmem-2.c: New execution tests
 * gcc.target/riscv/rvv/base/cmpmem-3.c: New codegen tests
 * gcc.target/riscv/rvv/base/cmpmem-4.c: New codegen tests
So I just committed the updated version of this patch to the trunk. 
Thanks again for the work and your patience.  I'll sort out patch #1 of 
this series next :-)


jeff



Re: [PATCH] Add param for bb limit to invoke fast_vrp.

2024-06-25 Thread David Malcolm
On Mon, 2024-06-24 at 21:27 -0700, Andrew Pinski wrote:
> On Mon, Jun 24, 2024 at 7:35 PM Andrew Pinski 
> wrote:
> > 
> > On Mon, Jun 24, 2024 at 7:20 PM Andrew MacLeod
> >  wrote:
> > > 
> > > 
> > > On 6/22/24 09:15, Richard Biener wrote:
> > > > On Fri, Jun 21, 2024 at 3:02 PM Andrew MacLeod
> > > >  wrote:
> > > > > This patch adds
> > > > > 
> > > > >   --param=vrp-block-limit=N
> > > > > 
> > > > > When the basic block counter for a function exceeds 'N',
> > > > > VRP is
> > > > > invoked with the new fast_vrp algorithm instead.   This
> > > > > algorithm uses a
> > > > > lot less memory and processing power, although it does get a
> > > > > few less
> > > > > things.
> > > > > 
> > > > > Primary motivation is cases like
> > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114855 in which
> > > > > the 3  VRP
> > > > > passes consume about 600 seconds of the compile time, and a
> > > > > lot of
> > > > > memory.  With fast_vrp, it spends less than 10 seconds
> > > > > total in the
> > > > > 3 passes of VRP. This test case has about 400,000 basic
> > > > > blocks.
> > > > > 
> > > > > The default for N in this patch is 150,000,  arbitrarily
> > > > > chosen.
> > > > > 
> > > > > This bootstraps, (and I bootstrapped it with --param=vrp-
> > > > > block-limit=0
> > > > > as well) on x86_64-pc-linux-gnu, with no regressions.
> > > > > 
> > > > > What do you think, OK for trunk?
> > > > +  if (last_basic_block_for_fn (fun) >
> > > > param_vrp_block_limit ||
> > > > + &data == &pass_data_fast_vrp)
> > > > 
> > > > > > goes to the next line.
> > > > 
> > > > Btw, we have -Wdisabled-optimization for these cases which
> > > > should
> > > > say sth like "function has excess of %d number of basic blocks
> > > > (--param vrp-block-limit=%d), using fast VRP algorithm"
> > > > or so in this case.
> > > > 
> > > > As I wrote in the PR the priority should be -O1 compile-time
> > > > performance and memory use.
> > > 
> > > 
> > > Yeah, I just wanted to use it as a model for "bad" cases for
> > > ranger.
> > > Adjusted patch attached which now issues the warning.
> > > 
> > > I also found that the transitive relations were causing a small
> > > blowup
> > > in time for relation processing now that I turned relations on
> > > for fast
> > > VRP.  I committed a patch and fast_vrp no longer does transitives.
> > > 
> > > If you want to experiment with enabling fast VRP at -O1, it
> > > should be
> > > fast all the time now.  I think :-)    This testcase runs in
> > > about 95
> > > seconds on my test machine.  If I turn on VRP, a single VRP pass
> > > takes
> > > about 2.5 seconds.  It's all set up, you can just add:
> > > 
> > > NEXT_PASS (pass_fast_vrp);
> > > 
> > > at an appropriate spot.
> > > 
> > > > Richard.
> > > > 
> > > > > Andrew
> > > > > 
> > > > > PS sorry, it doesn't help the threader in that PR :-(
> > > > It's probably one of the known bottlenecks there - IIRC the
> > > > path range
> > > > queries make O(n^2) work.  I can look at this at some point as
> > > > I've
> > > > dealt with the slowness of this pass in the past.
> > > > 
> > > > There is param_max_jump_thread_paths that should limit
> > > > searching
> > > > but there is IIRC no way to limit the work path ranger actually
> > > > does
> > > > when resolving the query.
> > > > 
> > > Yeah, I'd like to talk to Aldy about revamping the threader now
> > > that some
> > > of the newer facilities are available that fast_vrp uses.
> > > 
> > > We can calculate all the outgoing ranges for a block at once with
> > > :
> > > 
> > >    // Fill ssa-cache R with any outgoing ranges on edge E, using
> > > QUERY.
> > >    bool gori_on_edge (class ssa_cache &r, edge e, range_query
> > > *query =
> > > NULL);
> > > 
> > > This is what the fast_vrp routines uses.  We can gather all range
> > > restrictions generated from an edge efficiently just once and
> > > then
> > > intersect them with a known range as we walk the different paths.
> > > We
> > > don't need the gori exports , nor any of the other on-demand bits
> > > where
> > > we calculate each export range dynamically.. I suspect it would
> > > reduce
> > > the workload and memory impact quite a bit, but I'm not really
> > > familiar
> > > with exactly how the threader uses those things.
> > > 
> > > It'd require some minor tweaking to the lazy_ssa_cache to make
> > > the
> > > bitmap of names set accessible. This  would provide similar
> > > functionality to what the gori export () routine provides.  Both
> > > relations and inferred ranges should only need to be calculated
> > > once per
> > > block as well and could/should/would be applied the same way if
> > > they are
> > > present.   I don't *think* the threader uses any of the def
> > > chains, but
> > > Aldy can chip in.
> > 
> > +   warning (OPT_Wdisabled_optimization,
> > +    "Using fast VRP algorithm. %d basic blocks"
> > +    " exceeds %s%d limit",
> > +    n_basic_blocks_for_fn (fun),
> > +    "--param=vrp-bloc

Re: [PATCH 0/3] RISC-V: AMO testsuite cleanup

2024-06-25 Thread Jeff Law




On 6/25/24 3:14 PM, Patrick O'Neill wrote:

This is another round of AMO testcase cleanup. Consolidates a lot of testcases
and unifies the testcase names.

Patrick O'Neill (3):
   RISC-V: Rename amo testcases
   RISC-V: Consolidate amo testcase variants
   RISC-V: Update testcase comments to point to PSABI rather than Table
 A.6

[ ... ]
This series is OK for the trunk.

Thanks,
Jeff



Re: [PATCH 1/2] Record edge true/false value for gcov

2024-06-25 Thread Jeff Law




On 6/25/24 2:04 AM, Jørgen Kvalsvik wrote:

Make gcov aware of which edges are true/false to more accurately
reconstruct the CFG.  There are plenty of bits left in arc_info and it
opens up for richer reporting.

gcc/ChangeLog:

* gcov-io.h (GCOV_ARC_TRUE): New.
(GCOV_ARC_FALSE): New.
* gcov.cc (struct arc_info): Add true_value, false_value.
(read_graph_file): Read true_value, false_value.
* profile.cc (branch_prob): Write GCOV_ARC_TRUE, GCOV_ARC_FALSE.

I thought I'd already acked this patch.

So OK, again :-)

jeff
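
To make the ChangeLog above a little more concrete, here is a minimal sketch of how per-arc flag bits could be decoded into the two new booleans. Everything in it is an assumption for illustration: the `SKETCH_ARC_*` constants, their bit positions, and the `SketchArc` struct are hypothetical and are not the real `GCOV_ARC_*` values from gcov-io.h or the real `arc_info` from gcov.cc.

```cpp
#include <cstdint>

// Hypothetical flag bits; the real GCOV_ARC_* constants live in gcov-io.h
// and may use different positions.
constexpr std::uint32_t SKETCH_ARC_TRUE  = 1u << 3;
constexpr std::uint32_t SKETCH_ARC_FALSE = 1u << 4;

// Simplified stand-in for gcov's arc_info, keeping only the two new fields.
struct SketchArc {
  bool true_value;
  bool false_value;
};

// Decode the flag word read from the graph file into the two booleans.
constexpr SketchArc decode_arc(std::uint32_t flags) {
  return SketchArc{(flags & SKETCH_ARC_TRUE) != 0,
                   (flags & SKETCH_ARC_FALSE) != 0};
}
```

An arc that is neither the true nor the false edge of a conditional (for example a fallthrough) would decode with both fields false.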



[PATCH] i386: Remove declaration of unused functions

2024-06-25 Thread Evgeny Karpov
The patch fixes the issue introduced in
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=63512c72df09b43d56ac7680cdfd57a66d40c636
and reported at
https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655599.html .

Regards,
Evgeny


The patch fixes the issue with compilation on x86_64-gnu-linux
when warnings for unused functions are treated as errors.

gcc/ChangeLog:

* config/i386/i386.cc (legitimize_dllimport_symbol): Remove unused
functions.
(legitimize_pe_coff_extern_decl): Likewise.
---
 gcc/config/i386/i386.cc | 2 --
 1 file changed, 2 deletions(-)

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index aee88b08ae9..6d6a478f6f5 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -104,8 +104,6 @@ along with GCC; see the file COPYING3.  If not see
 /* This file should be included last.  */
 #include "target-def.h"
 
-static rtx legitimize_dllimport_symbol (rtx, bool);
-static rtx legitimize_pe_coff_extern_decl (rtx, bool);
 static void ix86_print_operand_address_as (FILE *, rtx, addr_space_t, bool);
 static void ix86_emit_restore_reg_using_pop (rtx, bool = false);
 
-- 
2.25.1



[PATCH] RISC-V: Add support for Zabha extension

2024-06-25 Thread Patrick O'Neill
From: Gianluca Guida 

The Zabha extension adds support for subword Zaamo ops.

Extension: https://github.com/riscv/riscv-zabha.git
Ratification: https://jira.riscv.org/browse/RVS-1685
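
For readers unfamiliar with subword atomics, the kind of source-level operation this affects looks roughly like the sketch below. The function names are made up for illustration. The assumption, not verified here, is that with Zabha enabled a byte-sized fetch-add like this could be emitted as a single `amoadd.b` rather than an LR/SC loop with masking; the actual code generation depends on `-march` and assembler support.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Atomically add 'delta' to a single byte and return the previous value.
std::uint8_t fetch_add_byte(std::atomic<std::uint8_t>& counter,
                            std::uint8_t delta) {
  return counter.fetch_add(delta, std::memory_order_seq_cst);
}

// Small self-check: unsigned subword arithmetic wraps modulo 256.
void fetch_add_byte_demo() {
  std::atomic<std::uint8_t> counter{250};
  std::uint8_t old = fetch_add_byte(counter, 10);
  assert(old == 250);
  assert(counter.load() == 4);  // 250 + 10 = 260, which wraps to 4
}
```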

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc
(riscv_subset_list::to_string): Skip zabha when not supported by
the assembler.
* config.in: Regenerate.
* config/riscv/arch-canonicalize: Make zabha imply zaamo.
* config/riscv/iterators.md (amobh): Add iterator for amo
byte/halfword.
* config/riscv/riscv.opt: Add zabha.
* config/riscv/sync.md (atomic_): Add
subword atomic op pattern.
(zabha_atomic_fetch_): Add subword
atomic_fetch op pattern.
(lrsc_atomic_fetch_): Prefer zabha over lrsc
for subword atomic ops.
(zabha_atomic_exchange): Add subword atomic exchange
pattern.
(lrsc_atomic_exchange): Prefer zabha over lrsc for subword
atomic exchange ops.
* configure: Regenerate.
* configure.ac: Add zabha assembler check.
* doc/sourcebuild.texi: Add zabha documentation.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp: Add zabha testsuite infra support.
* gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-1.c: Remove zabha
to continue to test the lr/sc subword patterns.
* gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-2.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-3.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-4.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-5.c: Ditto.
* gcc.target/riscv/amo/amo-table-ztso-subword-amo-add-1.c: Ditto.
* gcc.target/riscv/amo/amo-table-ztso-subword-amo-add-2.c: Ditto.
* gcc.target/riscv/amo/amo-table-ztso-subword-amo-add-3.c: Ditto.
* gcc.target/riscv/amo/amo-table-ztso-subword-amo-add-4.c: Ditto.
* gcc.target/riscv/amo/amo-table-ztso-subword-amo-add-5.c: Ditto.
* gcc.target/riscv/amo/inline-atomics-1.c: Ditto.
* gcc.target/riscv/amo/inline-atomics-2.c: Ditto.
* gcc.target/riscv/amo/zabha-all-amo-ops-char-run.c: New test.
* gcc.target/riscv/amo/zabha-all-amo-ops-short-run.c: New test.
* gcc.target/riscv/amo/zabha-rvwmo-all-amo-ops-char.c: New test.
* gcc.target/riscv/amo/zabha-rvwmo-all-amo-ops-short.c: New test.
* gcc.target/riscv/amo/zabha-rvwmo-amo-add-char.c: New test.
* gcc.target/riscv/amo/zabha-rvwmo-amo-add-short.c: New test.
* gcc.target/riscv/amo/zabha-ztso-amo-add-char.c: New test.
* gcc.target/riscv/amo/zabha-ztso-amo-add-short.c: New test.

Co-Authored-By: Patrick O'Neill 
---
Gianluca Guida created the initial patch. Rebased and added more
testcases/docs/etc.

This will trivially conflict with the testsuite cleanup [1] I sent but
I'll rebase as needed.

Tested using amo.exp with rv64gc_zalrsc, rv64id_zaamo, rv64id_zalrsc,
rv64id_zabha (using tip-of-tree qemu w/ zabha patches [2] applied for
execution tests).
Relying on precommit for full testing.

[1] 
https://inbox.sourceware.org/gcc-patches/42eae510-1804-40d2-8b6c-6f4c9c170...@gmail.com/T/#t
[2] 
https://lore.kernel.org/all/fed99165-58da-458c-b68f-a9717fc15...@linux.alibaba.com/T/
---
 gcc/common/config/riscv/riscv-common.cc   | 12 +++
 gcc/config.in |  6 ++
 gcc/config/riscv/arch-canonicalize|  3 +
 gcc/config/riscv/iterators.md |  3 +
 gcc/config/riscv/riscv.opt|  2 +
 gcc/config/riscv/sync.md  | 81 ++-
 gcc/configure | 31 +++
 gcc/configure.ac  |  5 ++
 gcc/doc/sourcebuild.texi  | 12 ++-
 .../amo/amo-table-a-6-subword-amo-add-1.c |  1 +
 .../amo/amo-table-a-6-subword-amo-add-2.c |  1 +
 .../amo/amo-table-a-6-subword-amo-add-3.c |  1 +
 .../amo/amo-table-a-6-subword-amo-add-4.c |  1 +
 .../amo/amo-table-a-6-subword-amo-add-5.c |  1 +
 .../amo/amo-table-ztso-subword-amo-add-1.c|  1 +
 .../amo/amo-table-ztso-subword-amo-add-2.c|  1 +
 .../amo/amo-table-ztso-subword-amo-add-3.c|  1 +
 .../amo/amo-table-ztso-subword-amo-add-4.c|  1 +
 .../amo/amo-table-ztso-subword-amo-add-5.c|  1 +
 .../gcc.target/riscv/amo/inline-atomics-1.c   |  1 +
 .../gcc.target/riscv/amo/inline-atomics-2.c   |  1 +
 .../riscv/amo/zabha-all-amo-ops-char-run.c|  5 ++
 .../riscv/amo/zabha-all-amo-ops-short-run.c   |  5 ++
 .../riscv/amo/zabha-rvwmo-all-amo-ops-char.c  | 23 ++
 .../riscv/amo/zabha-rvwmo-all-amo-ops-short.c | 23 ++
 .../riscv/amo/zabha-rvwmo-amo-add-char.c  | 57 +
 .../riscv/amo/zabha-rvwmo-amo-add-short.c | 57 +
 .../riscv/amo/zabha-ztso-amo-add-char.c   | 57 +
 .../riscv/amo/zabha-ztso-amo-add-short.c  | 57 +
 gcc/testsuite/lib/targ

[committed][gcc-11] libstdc++: Replace reference to mainline in release branch docs

2024-06-25 Thread Jonathan Wakely
libstdc++-v3/ChangeLog:

* doc/xml/manual/status_cxx2023.xml: Change reference from
mainline GCC to the release branch.
* doc/html/manual/status.html: Regenerate.
---
 libstdc++-v3/doc/html/manual/status.html   | 3 +--
 libstdc++-v3/doc/xml/manual/status_cxx2023.xml | 3 +--
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/libstdc++-v3/doc/html/manual/status.html b/libstdc++-v3/doc/html/manual/status.html
index 3805e9e24f0..838cba72f47 100644
--- a/libstdc++-v3/doc/html/manual/status.html
+++ b/libstdc++-v3/doc/html/manual/status.html
@@ -1702,8 +1702,7 @@ options. The pre-defined symbol
 __cplusplus is used to check for the
 presence of the required flag.
 
-This section describes the C++23 and library TS support in mainline GCC,
-not in any particular release.
+This section describes the C++23 and library TS support in the GCC 11 release series.
 
 The following table lists new library features that have been accepted into
 the C++23 working draft. The "Proposal" column provides a link to the
diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2023.xml b/libstdc++-v3/doc/xml/manual/status_cxx2023.xml
index 75f31f55aa9..381f989d482 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2023.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2023.xml
@@ -20,8 +20,7 @@ presence of the required flag.
 
 
 
-This section describes the C++23 and library TS support in mainline GCC,
-not in any particular release.
+This section describes the C++23 and library TS support in the GCC 11 release series.
 
 
 
-- 
2.45.2



[committed][gcc-12] libstdc++: Remove confusing text from status tables for release branch

2024-06-25 Thread Jonathan Wakely
Pushed to gcc-12.

-- >8 --

When I tried to make the release branch versions of these docs refer to
the release branch instead of "mainline GCC", for some reason I left the
text "not any particular release" there. That's just confusing, because
the docs are for a particular release, the latest on that branch. Remove
that confusing text in several places.

libstdc++-v3/ChangeLog:

* doc/xml/manual/status_cxx2023.xml: Change reference from
mainline GCC to the release branch.
* doc/xml/manual/status_cxx1998.xml: Remove confusing "not in
any particular release" text.
* doc/xml/manual/status_cxx2011.xml: Likewise.
* doc/xml/manual/status_cxx2014.xml: Likewise.
* doc/xml/manual/status_cxx2017.xml: Likewise.
* doc/xml/manual/status_cxx2020.xml: Likewise.
* doc/xml/manual/status_cxxtr1.xml: Likewise.
* doc/xml/manual/status_cxxtr24733.xml: Likewise.
* doc/html/manual/status.html: Regenerate.
---
 libstdc++-v3/doc/html/manual/status.html  | 24 +++
 .../doc/xml/manual/status_cxx1998.xml |  3 +--
 .../doc/xml/manual/status_cxx2011.xml |  3 +--
 .../doc/xml/manual/status_cxx2014.xml |  3 +--
 .../doc/xml/manual/status_cxx2017.xml |  3 +--
 .../doc/xml/manual/status_cxx2020.xml |  3 +--
 .../doc/xml/manual/status_cxx2023.xml |  3 +--
 libstdc++-v3/doc/xml/manual/status_cxxtr1.xml |  3 +--
 .../doc/xml/manual/status_cxxtr24733.xml  |  3 +--
 9 files changed, 16 insertions(+), 32 deletions(-)

diff --git a/libstdc++-v3/doc/html/manual/status.html b/libstdc++-v3/doc/html/manual/status.html
index a6494912c07..c212ea64fe6 100644
--- a/libstdc++-v3/doc/html/manual/status.html
+++ b/libstdc++-v3/doc/html/manual/status.html
@@ -5,8 +5,7 @@
  NextChapter 1. StatusTable of ContentsImplementation 
StatusC++ 
1998/2003Implementation 
StatusImplementation Specific 
BehaviorC++ 2011Implementation Specific 
BehaviorC++ 2014Implementation Specific 
BehaviorFilesystem 
TSC++ 2017Implementation Specific 
BehaviorParallelism 2 
TSC++ 2020C++ 
2023C++ TR1Implementation Specific 
BehaviorC++ TR 24733C++ IS 
29124Implementation Specific 
BehaviorLicenseThe Code: 
GPLThe Documentation: GPL, 
FDLBugsImplementation 
BugsStandard 
BugsImplementation 
StatusC++ 
1998/2003Implementation Status
 This status table is based on the table of contents of ISO/IEC 14882:2003.
 
-This section describes the C++ support in the GCC 12 release series,
-not in any particular release.
+This section describes the C++ support in the GCC 12 release series.
 Table 1.1. C++ 1998/2003 Implementation 
StatusSectionDescriptionStatusComments
18
   
@@ -160,8 +159,7 @@ since that release.
 
 This status table is based on the table of contents of ISO/IEC 14882:2011.
 
-This section describes the C++11 support in the GCC 12 release series,
-not in any particular release.
+This section describes the C++11 support in the GCC 12 release series.
 Table 1.2. C++ 2011 Implementation 
StatusSectionDescriptionStatusComments
18
   
@@ -433,8 +431,7 @@ This status table is based on the table of contents of ISO/IEC 14882:2014.
 Some subclauses are not shown in the table where the content is unchanged
 since C++11 and the implementation is complete.
 
-This section describes the C++14 and library TS support in the GCC 12 release series,
-not in any particular release.
+This section describes the C++14 and library TS support in the GCC 12 release series.
 Table 1.3. C++ 2014 Implementation 
StatusSectionDescriptionStatusComments
18
   
@@ -578,8 +575,7 @@ GCC 9.1 was the first release with non-experimental C++17 support,
 so the API and ABI of features added in C++17 is only stable
 since that release.
 
-This section describes the C++17 and library TS support in the GCC 12 release series,
-not in any particular release.
+This section describes the C++17 and library TS support in the GCC 12 release series.
 
 The following table lists new library features that are included in
 the C++17 standard. The "Proposal" column provides a link to the
@@ -1254,8 +1250,7 @@ options. The pre-defined symbol
 __cplusplus is used to check for the
 presence of the required flag.
 
-This section describes the C++20 and library TS support in the GCC 12 release series,
-not in any particular release.
+This section describes the C++20 and library TS support in the GCC 12 release series.
 
 The following table lists new library features that are included in
 the C++20 standard. The "Proposal" column provides a link to the
@@ -1720,8 +1715,7 @@ options. The pre-defined symbol
 __cplusplus is used to check for the
 presence of the required flag.
 
-This section describes the C++23 and library TS support in mainline GCC,
-not in any particular release.
+This section describes the C++23 and library TS support in the GCC 12 release series.
 
 The following table lists new library features that have b

[committed][gcc-13] libstdc++: Remove confusing text from status tables for release branch

2024-06-25 Thread Jonathan Wakely
Pushed to gcc-13.

-- >8 --

When I tried to make the release branch versions of these docs refer to
the release branch instead of "mainline GCC", for some reason I left the
text "not any particular release" there. That's just confusing, because
the docs are for a particular release, the latest on that branch. Remove
that confusing text in several places.

libstdc++-v3/ChangeLog:

* doc/xml/manual/status_cxx1998.xml: Remove confusing "not in
any particular release" text.
* doc/xml/manual/status_cxx2011.xml: Likewise.
* doc/xml/manual/status_cxx2014.xml: Likewise.
* doc/xml/manual/status_cxx2017.xml: Likewise.
* doc/xml/manual/status_cxx2020.xml: Likewise.
* doc/xml/manual/status_cxx2023.xml: Likewise.
* doc/xml/manual/status_cxxtr1.xml: Likewise.
* doc/xml/manual/status_cxxtr24733.xml: Likewise.
* doc/html/manual/status.html: Regenerate.
---
 libstdc++-v3/doc/html/manual/status.html | 16 
 libstdc++-v3/doc/xml/manual/status_cxx1998.xml   |  2 +-
 libstdc++-v3/doc/xml/manual/status_cxx2011.xml   |  2 +-
 libstdc++-v3/doc/xml/manual/status_cxx2014.xml   |  2 +-
 libstdc++-v3/doc/xml/manual/status_cxx2017.xml   |  2 +-
 libstdc++-v3/doc/xml/manual/status_cxx2020.xml   |  2 +-
 libstdc++-v3/doc/xml/manual/status_cxx2023.xml   |  2 +-
 libstdc++-v3/doc/xml/manual/status_cxxtr1.xml|  2 +-
 .../doc/xml/manual/status_cxxtr24733.xml |  2 +-
 9 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/libstdc++-v3/doc/html/manual/status.html b/libstdc++-v3/doc/html/manual/status.html
index 53957bd2e0d..a2c89002289 100644
--- a/libstdc++-v3/doc/html/manual/status.html
+++ b/libstdc++-v3/doc/html/manual/status.html
@@ -6,7 +6,7 @@
 This status table is based on the table of contents of ISO/IEC 14882:2003.
 
 This section describes the C++ support in
-the GCC 13 release series, not in any particular release.
+the GCC 13 release series.
 Table 1.1. C++ 1998/2003 Implementation 
StatusSectionDescriptionStatusComments
18
   
@@ -161,7 +161,7 @@ since that release.
 This status table is based on the table of contents of ISO/IEC 14882:2011.
 
 This section describes the C++11 support in
-the GCC 13 release series, not in any particular release.
+the GCC 13 release series.
 Table 1.2. C++ 2011 Implementation 
StatusSectionDescriptionStatusComments
18
   
@@ -434,7 +434,7 @@ Some subclauses are not shown in the table where the content is unchanged
 since C++11 and the implementation is complete.
 
 This section describes the C++14 and library TS support in
-the GCC 13 release series, not in any particular release.
+the GCC 13 release series.
 Table 1.3. C++ 2014 Implementation 
StatusSectionDescriptionStatusComments
18
   
@@ -579,7 +579,7 @@ so the API and ABI of features added in C++17 is only stable
 since that release.
 
 This section describes the C++17 and library TS support in
-the GCC 13 release series, not in any particular release.
+the GCC 13 release series.
 
 The following table lists new library features that are included in
 the C++17 standard. The "Proposal" column provides a link to the
@@ -1255,7 +1255,7 @@ options. The pre-defined symbol
 presence of the required flag.
 
 This section describes the C++20 and library TS support in
-the GCC 13 release series, not in any particular release.
+the GCC 13 release series.
 
 The following table lists new library features that are included in
 the C++20 standard. The "Proposal" column provides a link to the
@@ -1725,7 +1725,7 @@ options. The pre-defined symbol
 presence of the required flag.
 
 This section describes the C++23 and library TS support in
-the GCC 13 release series, not in any particular release.
+the GCC 13 release series.
 
 The following table lists new library features that have been accepted into
 the C++23 working draft. The "Proposal" column provides a link to the
@@ -2123,7 +2123,7 @@ In this implementation the header names are prefixed by
 , and so on.
 
 This page describes the TR1 support in
-the GCC 13 release series, not in any particular release.
+the GCC 13 release series.
 Table 1.11. C++ TR1 Implementation 
StatusSectionDescriptionStatusComments2General Utilities2.1Reference wrappers 
 2.1.1Additions to header  
synopsisY 2.1.2Class template reference_wrapper  2.1.2.1reference_wrapper 
construct/copy/destroyY 
2.1.2.2reference_wrapper assignmentY 2.1.2.3reference_wrapper accessY 2.1.2.4reference_wrapper invocationY 2.1.2.5reference_wrapper helper 
functionsY 2.2Smart pointers 
 2.2.1Additions to header  
synopsisY 2.2.2Class bad_weak_ptrY 
2.2.3Class template shared_ptr 

  Uses code from
@@ -2144,7 +2144,7 @@ ISO/IEC TR 24733:2011,
 decimal floating-point arithmetic".
 
 This page describes the TR 24733 support in
-the GCC 13 release series, not in any particular release.
+the GCC 13 release series.
 Table 

Re: [committed][gcc-13] libstdc++: Remove confusing text from status tables for release branch

2024-06-25 Thread Jonathan Wakely
On Tue, 25 Jun 2024 at 23:34, Jonathan Wakely  wrote:
>
> Pushed to gcc-13.

And the equivalent for gcc-14 too.


>
> -- >8 --
>
> When I tried to make the release branch versions of these docs refer to
> the release branch instead of "mainline GCC", for some reason I left the
> text "not any particular release" there. That's just confusing, because
> the docs are for a particular release, the latest on that branch. Remove
> that confusing text in several places.
>
> libstdc++-v3/ChangeLog:
>
> * doc/xml/manual/status_cxx1998.xml: Remove confusing "not in
> any particular release" text.
> * doc/xml/manual/status_cxx2011.xml: Likewise.
> * doc/xml/manual/status_cxx2014.xml: Likewise.
> * doc/xml/manual/status_cxx2017.xml: Likewise.
> * doc/xml/manual/status_cxx2020.xml: Likewise.
> * doc/xml/manual/status_cxx2023.xml: Likewise.
> * doc/xml/manual/status_cxxtr1.xml: Likewise.
> * doc/xml/manual/status_cxxtr24733.xml: Likewise.
> * doc/html/manual/status.html: Regenerate.
> ---
>  libstdc++-v3/doc/html/manual/status.html | 16 
>  libstdc++-v3/doc/xml/manual/status_cxx1998.xml   |  2 +-
>  libstdc++-v3/doc/xml/manual/status_cxx2011.xml   |  2 +-
>  libstdc++-v3/doc/xml/manual/status_cxx2014.xml   |  2 +-
>  libstdc++-v3/doc/xml/manual/status_cxx2017.xml   |  2 +-
>  libstdc++-v3/doc/xml/manual/status_cxx2020.xml   |  2 +-
>  libstdc++-v3/doc/xml/manual/status_cxx2023.xml   |  2 +-
>  libstdc++-v3/doc/xml/manual/status_cxxtr1.xml|  2 +-
>  .../doc/xml/manual/status_cxxtr24733.xml |  2 +-
>  9 files changed, 16 insertions(+), 16 deletions(-)
>
> diff --git a/libstdc++-v3/doc/html/manual/status.html b/libstdc++-v3/doc/html/manual/status.html
> index 53957bd2e0d..a2c89002289 100644
> --- a/libstdc++-v3/doc/html/manual/status.html
> +++ b/libstdc++-v3/doc/html/manual/status.html
> @@ -6,7 +6,7 @@
>  This status table is based on the table of contents of ISO/IEC 14882:2003.
>  
>  This section describes the C++ support in
> -the GCC 13 release series, not in any particular release.
> +the GCC 13 release series.
>   class="title">Table 1.1. C++ 1998/2003 Implementation 
> Status summary="C++ 1998/2003 Implementation Status" border="1"> align="left" class="c1" /> class="c3" /> align="left">SectionDescription align="left">Status align="left">Comments
> 18
>
> @@ -161,7 +161,7 @@ since that release.
>  This status table is based on the table of contents of ISO/IEC 14882:2011.
>  
>  This section describes the C++11 support in
> -the GCC 13 release series, not in any particular release.
> +the GCC 13 release series.
>   class="title">Table 1.2. C++ 2011 Implementation 
> Status summary="C++ 2011 Implementation Status" border="1"> align="left" class="c1" /> class="c3" /> align="left">SectionDescription align="left">Status align="left">Comments
> 18
>
> @@ -434,7 +434,7 @@ Some subclauses are not shown in the table where the content is unchanged
>  since C++11 and the implementation is complete.
>  
>  This section describes the C++14 and library TS support in
> -the GCC 13 release series, not in any particular release.
> +the GCC 13 release series.
>   class="title">Table 1.3. C++ 2014 Implementation 
> Status summary="C++ 2014 Implementation Status" border="1"> align="left" class="c1" /> class="c3" /> align="left">SectionDescription align="left">Status align="left">Comments
> 18
>
> @@ -579,7 +579,7 @@ so the API and ABI of features added in C++17 is only stable
>  since that release.
>  
>  This section describes the C++17 and library TS support in
> -the GCC 13 release series, not in any particular release.
> +the GCC 13 release series.
>  
>  The following table lists new library features that are included in
>  the C++17 standard. The "Proposal" column provides a link to the
> @@ -1255,7 +1255,7 @@ options. The pre-defined symbol
>  presence of the required flag.
>  
>  This section describes the C++20 and library TS support in
> -the GCC 13 release series, not in any particular release.
> +the GCC 13 release series.
>  
>  The following table lists new library features that are included in
>  the C++20 standard. The "Proposal" column provides a link to the
> @@ -1725,7 +1725,7 @@ options. The pre-defined symbol
>  presence of the required flag.
>  
>  This section describes the C++23 and library TS support in
> -the GCC 13 release series, not in any particular release.
> +the GCC 13 release series.
>  
>  The following table lists new library features that have been accepted into
>  the C++23 working draft. The "Proposal" column provides a link to the
> @@ -2123,7 +2123,7 @@ In this implementation the header names are prefixed by
>  , and so on.
>  
>  This page describes the TR1 support in
> -the GCC 13 release series, not in any particular release.
> +the GCC 13 release series.
>   class="title">Table 1.11. C++ TR1 Implementation 
>

[PATCH] libstdc++: Add script to update docs for a new release branch

2024-06-25 Thread Jonathan Wakely
This script automates some updates that should be made when branching
from trunk. Putting them in a script makes it much easier and means I
won't forget what should be done.

Any suggestions for doing this differently?

Anything I've forgotten that should be added here?

We could add an entry to the lists of versions at
https://gcc.gnu.org/onlinedocs/libstdc++/manual/abi.html#abi.versioning.goals
but that should really be done when bumping the libtool version, not
when branching from trunk.

-- >8 --

This should be run on a release branch after branching from trunk.
Various links and references to trunk in the docs will be updated to
refer to the new release branch.

libstdc++-v3/ChangeLog:

* scripts/update_release_branch.sh: New file.
---
 libstdc++-v3/scripts/update_release_branch.sh | 14 ++
 1 file changed, 14 insertions(+)
 create mode 100755 libstdc++-v3/scripts/update_release_branch.sh

diff --git a/libstdc++-v3/scripts/update_release_branch.sh b/libstdc++-v3/scripts/update_release_branch.sh
new file mode 100755
index 000..f8109ed0ba3
--- /dev/null
+++ b/libstdc++-v3/scripts/update_release_branch.sh
@@ -0,0 +1,14 @@
+#!/bin/bash
+
+# This should be run on a release branch after branching from trunk.
+# Various links and references to trunk in the docs will be updated to
+# refer to the new release branch.
+
+# The major version of the new release branch.
+major=$1
+(($major)) || { echo "$0: Integer argument expected" >& 2 ; exit 1; }
+
+# This assumes GNU sed
+sed -i "s@^mainline GCC, not in any particular major.\$@the GCC ${major} series.@" doc/xml/manual/status_cxx*.xml
+sed -i 's@https://gcc.gnu.org/cgit/gcc/tree/libstdc++-v3/testsuite/[^"]\+@&?h=releases%2Fgcc-'${major}@ doc/xml/manual/allocator.xml doc/xml/manual/mt_allocator.xml
+sed -i "s@https://gcc.gnu.org/onlinedocs/gcc/Invoking-GCC.html@https://gcc.gnu.org/onlinedocs/gcc-${major}.1.0/gcc/Invoking-GCC.html@" doc/xml/manual/using.xml
-- 
2.45.2



Re: [PATCH] libstdc++: Simplify std::valarray initialization helpers

2024-06-25 Thread Jonathan Wakely
Pushed to trunk.

On Thu, 20 Jun 2024 at 16:34, Jonathan Wakely  wrote:
>
> Tested x86_64-linux.
>
> -- >8 --
>
> Dispatching to partial specializations doesn't really seem to offer much
> benefit here. The __is_trivial(T) condition is a compile-time constant
> so the untaken branches are dead code and don't cost us anything.
>
> libstdc++-v3/ChangeLog:
>
> * include/bits/valarray_array.h (_Array_default_ctor): Remove.
> (__valarray_default_construct): Inline it into here.
> (_Array_init_ctor): Remove.
> (__valarray_fill_construct): Inline it into here.
> (_Array_copy_ctor): Remove.
> (__valarray_copy_construct(const T*, const T*, T*)): Inline it
> into here.
> (__valarray_copy_construct(const T*, size_t, size_t, T*)):
> Use _GLIBCXX17_CONSTEXPR for constant condition.
> ---
>  libstdc++-v3/include/bits/valarray_array.h | 97 --
>  1 file changed, 18 insertions(+), 79 deletions(-)
>
> diff --git a/libstdc++-v3/include/bits/valarray_array.h b/libstdc++-v3/include/bits/valarray_array.h
> index 07c49ce1057..0dc28333f33 100644
> --- a/libstdc++-v3/include/bits/valarray_array.h
> +++ b/libstdc++-v3/include/bits/valarray_array.h
> @@ -62,105 +62,44 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>__valarray_release_memory(void* __p)
>{ operator delete(__p); }
>
> -  // Turn a raw-memory into an array of _Tp filled with _Tp()
> -  // This is required in 'valarray<T> v(n);'
> -  template<typename _Tp, bool>
> -struct _Array_default_ctor
> -{
> -  // Please note that this isn't exception safe.  But
> -  // valarrays aren't required to be exception safe.
> -  inline static void
> -  _S_do_it(_Tp* __b, _Tp* __e)
> -  {
> -   while (__b != __e)
> - new(__b++) _Tp();
> -  }
> -};
> -
> -  template<typename _Tp>
> -struct _Array_default_ctor<_Tp, true>
> -{
> -  // For trivial types, it suffices to say 'memset()'
> -  inline static void
> -  _S_do_it(_Tp* __b, _Tp* __e)
> -  { __builtin_memset(__b, 0, (__e - __b) * sizeof(_Tp)); }
> -};
> -
> +  // Turn raw-memory into an array of _Tp filled with _Tp().
> +  // This is used in `valarray<T> v(n);` and in `valarray<T>::shift(n)`.
>template<typename _Tp>
>  inline void
>  __valarray_default_construct(_Tp* __b, _Tp* __e)
>  {
> -  _Array_default_ctor<_Tp, __is_trivial(_Tp)>::_S_do_it(__b, __e);
> +  if _GLIBCXX17_CONSTEXPR (__is_trivial(_Tp))
> +   __builtin_memset(__b, 0, (__e - __b) * sizeof(_Tp));
> +  else
> +   while (__b != __e)
> + ::new(static_cast<void*>(__b++)) _Tp();
>  }
>
>// Turn a raw-memory into an array of _Tp filled with __t
>// This is the required in valarray<T> v(n, t).  Also
>// used in valarray<>::resize().
> -  template<typename _Tp, bool>
> -struct _Array_init_ctor
> -{
> -  // Please note that this isn't exception safe.  But
> -  // valarrays aren't required to be exception safe.
> -  inline static void
> -  _S_do_it(_Tp* __b, _Tp* __e, const _Tp __t)
> -  {
> -   while (__b != __e)
> - new(__b++) _Tp(__t);
> -  }
> -};
> -
> -  template<typename _Tp>
> -struct _Array_init_ctor<_Tp, true>
> -{
> -  inline static void
> -  _S_do_it(_Tp* __b, _Tp* __e, const _Tp __t)
> -  {
> -   while (__b != __e)
> - *__b++ = __t;
> -  }
> -};
> -
>template<typename _Tp>
>  inline void
>  __valarray_fill_construct(_Tp* __b, _Tp* __e, const _Tp __t)
>  {
> -  _Array_init_ctor<_Tp, __is_trivial(_Tp)>::_S_do_it(__b, __e, __t);
> +  while (__b != __e)
> +   ::new(static_cast<void*>(__b++)) _Tp(__t);
>  }
>
> -  //
>// copy-construct raw array [__o, *) from plain array [__b, __e)
> -  // We can't just say 'memcpy()'
> -  //
> -  template<typename _Tp, bool>
> -struct _Array_copy_ctor
> -{
> -  // Please note that this isn't exception safe.  But
> -  // valarrays aren't required to be exception safe.
> -  inline static void
> -  _S_do_it(const _Tp* __b, const _Tp* __e, _Tp* __restrict__ __o)
> -  {
> -   while (__b != __e)
> - new(__o++) _Tp(*__b++);
> -  }
> -};
> -
> -  template<typename _Tp>
> -struct _Array_copy_ctor<_Tp, true>
> -{
> -  inline static void
> -  _S_do_it(const _Tp* __b, const _Tp* __e, _Tp* __restrict__ __o)
> -  {
> -   if (__b)
> - __builtin_memcpy(__o, __b, (__e - __b) * sizeof(_Tp));
> -  }
> -};
> -
>template<typename _Tp>
>  inline void
>  __valarray_copy_construct(const _Tp* __b, const _Tp* __e,
>   _Tp* __restrict__ __o)
>  {
> -  _Array_copy_ctor<_Tp, __is_trivial(_Tp)>::_S_do_it(__b, __e, __o);
> +  if _GLIBCXX17_CONSTEXPR (__is_trivial(_Tp))
> +   {
> + if (__b)
> +   __builtin_memcpy(__o, __b, (__e - __b) * sizeof(_Tp));
> +   }
> +  else
> +   while (__b != __e)
> + ::new(static_cast<void*>(__o++)) _Tp(*__b++);
>  }
>
>// copy-construct raw array [__o, *) from strided array __a[<__n :
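
For readers following the reasoning in the commit message above, here is a
minimal standalone sketch of the same idea. It is not the libstdc++ code:
the names default_construct_range and copy_construct_range are hypothetical
stand-ins for __valarray_default_construct and __valarray_copy_construct,
and it assumes C++17 so that plain `if constexpr` can be used where the
patch uses the _GLIBCXX17_CONSTEXPR macro. Because the triviality check is
a compile-time constant, the branch that is not taken is discarded and no
partial specialization is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>
#include <string>
#include <type_traits>

// Hypothetical stand-in for __valarray_default_construct:
// value-initialize every element of the raw storage [b, e).
template<typename T>
void default_construct_range(T* b, T* e)
{
  if constexpr (std::is_trivial_v<T>)
    {
      // Trivial types: zero the bytes in a single call.
      if (b != e)
        std::memset(b, 0, static_cast<std::size_t>(e - b) * sizeof(T));
    }
  else
    // Non-trivial types: run the default constructor on each slot.
    while (b != e)
      ::new(static_cast<void*>(b++)) T();
}

// Hypothetical stand-in for __valarray_copy_construct:
// copy-construct raw storage starting at out from [b, e).
template<typename T>
void copy_construct_range(const T* b, const T* e, T* out)
{
  if constexpr (std::is_trivial_v<T>)
    {
      // Trivial types: a byte copy is enough; skip it for a null source.
      if (b)
        std::memcpy(out, b, static_cast<std::size_t>(e - b) * sizeof(T));
    }
  else
    // Non-trivial types: run the copy constructor on each slot.
    while (b != e)
      ::new(static_cast<void*>(out++)) T(*b++);
}
```

Calling default_construct_range on an int array takes the memset branch,
while calling it on raw storage for std::string takes the placement-new
loop. The static_cast<void*> in the placement-new expressions selects the
global non-allocating operator new even if T overloads operator new.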

Re: [PATCH] i386: Remove declaration of unused functions

2024-06-25 Thread Iain Sandoe



> On 25 Jun 2024, at 22:59, Evgeny Karpov  wrote:
> 
> The patch fixes the issue introduced in
> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=63512c72df09b43d56ac7680cdfd57a66d40c636
> and reported at
> https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655599.html .

Trivial patches like this that fix bootstrap on multiple targets can be applied
without extra approval. This one fixes bootstrap for x86 Darwin, so OK.
Iain

> 
> Regards,
> Evgeny
> 
> 
> The patch fixes the issue with compilation on x86_64-gnu-linux
> when warnings for unused functions are treated as errors.
> 
> gcc/ChangeLog:
> 
>   * config/i386/i386.cc (legitimize_dllimport_symbol): Remove unused
>   functions.
>   (legitimize_pe_coff_extern_decl): Likewise.
> ---
> gcc/config/i386/i386.cc | 2 --
> 1 file changed, 2 deletions(-)
> 
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index aee88b08ae9..6d6a478f6f5 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -104,8 +104,6 @@ along with GCC; see the file COPYING3.  If not see
> /* This file should be included last.  */
> #include "target-def.h"
> 
> -static rtx legitimize_dllimport_symbol (rtx, bool);
> -static rtx legitimize_pe_coff_extern_decl (rtx, bool);
> static void ix86_print_operand_address_as (FILE *, rtx, addr_space_t, bool);
> static void ix86_emit_restore_reg_using_pop (rtx, bool = false);
> 
> -- 
> 2.25.1
> 



Re: [PATCH 08/11] Handle unions for CodeView.

2024-06-25 Thread Jeff Law




On 6/17/24 6:17 PM, Mark Harmstone wrote:

Translates DW_TAG_union_type DIEs into LF_UNION symbols.

 gcc/
 * dwarf2codeview.cc (write_lf_union): New function.
 (write_custom_types): Call write_lf_union.
 (add_struct_forward_def): Handle DW_TAG_union_type DIEs.
 (get_type_num_struct): Handle unions.
 (get_type_num): Handle DW_TAG_union_type DIEs.
 * dwarf2codeview.h (LF_UNION): Define.

Thanks.  I've pushed this to the trunk.

jeff



Re: [PATCH 09/11] Handle arrays for CodeView

2024-06-25 Thread Jeff Law




On 6/17/24 6:17 PM, Mark Harmstone wrote:

Translates DW_TAG_array_type DIEs into LF_ARRAY symbols.

 gcc/
 * dwarf2codeview.cc
 (struct codeview_custom_type): Add lf_array to union.
 (write_lf_array): New function.
 (write_custom_types): Call write_lf_array.
 (get_type_num_array_type): New function.
 (get_type_num): Handle DW_TAG_array_type DIEs.
 * dwarf2codeview.h (LF_ARRAY): Define.

Thanks.  I've pushed this to the trunk.
jeff



[pushed] testsuite: use check-jsonschema for validating .sarif files [PR109360]

2024-06-25 Thread David Malcolm
As reported here:
  https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655434.html
the schema validation I added for generated .sarif files in
r15-1541-ga84fe222029ff2 used the "jsonschema" command line tool, which
has been deprecated by more recent versions of the Python 3 "jsonschema"
module.

This patch updates the validation to use the more recent
"check-jsonschema" command line tool, from the Python 3 "check-jsonschema"
module, fixing the testsuite FAILs due to the deprecation message.

As an added bonus, the output on validation failures is *much* nicer, e.g.
if I undo r15-1540-g9f4fdc3acebcf6, the error messages begin like this:
verify-sarif-file: res: Schema validation errors were encountered.
  diagnostic-format-sarif-file-bad-utf8-pr109098-1.c.sarif::$.runs[0].results[0].locations[0].physicalLocation.region.startColumn: 0 is less than the minimum of 1
  diagnostic-format-sarif-file-bad-utf8-pr109098-1.c.sarif::$.runs[0].results[0].relatedLocations[0].physicalLocation.region.startColumn: 0 is less than the minimum of 1
  diagnostic-format-sarif-file-bad-utf8-pr109098-1.c.sarif::$.runs[0].results[0].relatedLocations[1].physicalLocation.region.startColumn: 0 is less than the minimum of 1
  diagnostic-format-sarif-file-bad-utf8-pr109098-1.c.sarif::$.runs[0].results[0].relatedLocations[2].physicalLocation.region.startColumn: 0 is less than the minimum of 1
child process exited abnormally
FAIL: c-c++-common/diagnostic-format-sarif-file-bad-utf8-pr109098-1.c  -Wc++-compat   (test .sarif output against SARIF schema)

Tested with Python 3.8 with check_jsonschema 0.28.6

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Successful run of analyzer integration tests on x86_64-pc-linux-gnu.
Pushed to trunk as r15-1633-g17967907102099.

gcc/ChangeLog:
PR testsuite/109360
* doc/install.texi (Python3 modules): Update SARIF validation
requirement to use check-jsonschema rather than jsonschema.

gcc/testsuite/ChangeLog:
PR testsuite/109360
* lib/scansarif.exp (verify-sarif-file): Use check-jsonschema
rather than jsonschema, updating the invocation accordingly.
* lib/target-supports.exp (check_effective_target_jsonschema): Convert
to...
(check_effective_target_check_jsonschema): ...this.

Signed-off-by: David Malcolm 
---
 gcc/doc/install.texi  | 7 ---
 gcc/testsuite/lib/scansarif.exp   | 8 
 gcc/testsuite/lib/target-supports.exp | 6 +++---
 3 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 0c7691651466..b54569925837 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -461,9 +461,10 @@ is shown below:
 @code{gcov}, @code{gzip}, @code{json}, @code{os} and @code{pytest}.
 
 @item SARIF testsuite
-Tests of SARIF output will use the @code{jsonschema} program from the
-@code{jsonschema} module (if available) to validate generated .sarif files.
-If this tool is not found, the validation parts of those tests are skipped.
+Tests of SARIF output will use the @code{check-jsonschema} program from
+the @code{check-jsonschema} module (if available) to validate generated
+.sarif files.  If this tool is not found, the validation parts of those
+tests are skipped.
 
 @item c++ cxx api generation
 @code{csv}, @code{os}, @code{sys} and @code{time}.
diff --git a/gcc/testsuite/lib/scansarif.exp b/gcc/testsuite/lib/scansarif.exp
index 3eb38b8102e9..cc0890ef5d8b 100644
--- a/gcc/testsuite/lib/scansarif.exp
+++ b/gcc/testsuite/lib/scansarif.exp
@@ -57,7 +57,7 @@ proc scan-sarif-file-not { args } {
 # Assuming python3 is available, use verify-sarif-file.py to check
 # that the .sarif file is UTF-8 encoded and is parseable as JSON.
 #
-# Assuming "jsonschema" is available, use it to verify that the .sarif
+# Assuming "check-jsonschema" is available, use it to verify that the .sarif
 # file complies with the SARIF schema.
 
 proc verify-sarif-file { args } {
@@ -86,8 +86,8 @@ proc verify-sarif-file { args } {
 # Verify that the file complies with the SARIF schema.
 
 # Check that jsonschema is installed.
-if { ![check_effective_target_jsonschema] } {
-   unsupported "$testcase verify-sarif-file: jsonschema is missing"
+if { ![check_effective_target_check_jsonschema] } {
+   unsupported "$testcase verify-sarif-file: check-jsonschema is missing"
return  
 }
 
@@ -95,7 +95,7 @@ proc verify-sarif-file { args } {
 verbose "schema_file: $schema_file" 2
 
 set what "$testcase (test .sarif output against SARIF schema)"
-if [catch {exec jsonschema --instance $output_file $schema_file} res ] {
+if [catch {exec check-jsonschema --schemafile $schema_file $output_file} res ] {
verbose "verify-sarif-file: res: $res" 2
fail "$what"
return
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index ed30cd18ad69..828c46fb15e5 100644
--- a/gcc/t

[pushed] diagnostics: eliminate various implicit uses of global_dc

2024-06-25 Thread David Malcolm
This patch eliminates all implicit uses of "global_dc" from
the path-printing logic and from
gcc_rich_location::add_location_if_nearby.

No functional change intended.
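
As a rough illustration of the pattern (not the GCC code), the sketch
below shows a method that takes its context as an explicit parameter
instead of reading a global. Everything in it is hypothetical: diag_context
and rich_loc are simplified stand-ins for diagnostic_context and
gcc_rich_location, and the line-distance test is an invented criterion,
not the real "nearby" logic.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, much-simplified stand-in for diagnostic_context.
struct diag_context
{
  // Invented setting: how many lines apart two locations may be
  // and still count as "nearby".
  int max_nearby_distance = 5;
};

// Hypothetical stand-in for gcc_rich_location; locations are line numbers.
class rich_loc
{
public:
  explicit rich_loc(int primary_line) : m_lines{primary_line} {}

  // The context arrives as a parameter rather than through a global,
  // so callers (including tests) choose which context is consulted.
  bool add_location_if_nearby(const diag_context& ctxt, int line)
  {
    int delta = line - m_lines.front();
    if (delta < 0)
      delta = -delta;
    if (delta > ctxt.max_nearby_distance)
      return false;
    m_lines.push_back(line);
    return true;
  }

  std::size_t num_locations() const { return m_lines.size(); }

private:
  std::vector<int> m_lines;
};
```

With this shape a test can build its own context and pass it in, which
is what the selftest entries in the ChangeLog below describe, while
ordinary callers pass the global one explicitly.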

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Successful run of analyzer integration tests on x86_64-pc-linux-gnu.
Pushed to trunk as r15-1634-gd681c5211e613c.

gcc/c/ChangeLog:
* c-parser.cc (c_parser_require): Pass *global_dc to
gcc_rich_location::add_location_if_nearby.

gcc/cp/ChangeLog:
* parser.cc (cp_parser_error_1): Pass *global_dc to
gcc_rich_location::add_location_if_nearby.
(cp_parser_decl_specifier_seq): Likewise.
(cp_parser_set_storage_class): Likewise.
(cp_parser_set_storage_class): Likewise.

gcc/ChangeLog:
* diagnostic-path.cc (class path_label): Add m_path field,
and use it to replace all uses of global_dc.
(event_range::event_range): Add "ctxt" param and use it to
construct m_path_label.
(event_range::maybe_add_event): Add "ctxt" param and pass it to
gcc_rich_location::add_location_if_nearby.
(path_summary::path_summary): Add "ctxt" param and pass it to
event_range::maybe_add_event.
(diagnostic_context::print_path): Pass *this to path_summary ctor.
(selftest::test_empty_path): Use "dc" when constructing
path_summary rather than implicitly using global_dc.
(selftest::test_intraprocedural_path): Likewise.
(selftest::test_interprocedural_path_1): Likewise.
(selftest::test_interprocedural_path_2): Likewise.
(selftest::test_recursion): Likewise.
(selftest::test_control_flow_1): Likewise.
(selftest::test_control_flow_2): Likewise.
(selftest::test_control_flow_3): Likewise.
(selftest::assert_cfg_edge_path_streq): Likewise.
(selftest::test_control_flow_5): Likewise.
(selftest::test_control_flow_6): Likewise.
(selftest::diagnostic_path_cc_tests): Eliminate use of global_dc.
* diagnostic-show-locus.cc
(gcc_rich_location::add_location_if_nearby): Add "ctxt" param and
use it instead of implicitly using global_dc.
(selftest::test_add_location_if_nearby): Use
test_diagnostic_context rather than implicitly using global_dc.
* diagnostic.cc (pedantic_warning_kind): Delete macro.
(permissive_error_kind): Delete macro.
(permissive_error_option): Delete macro.
(diagnostic_context::diagnostic_enabled): Remove use of
permissive_error_option.
(diagnostic_context::report_diagnostic): Remove use of
pedantic_warning_kind.
(diagnostic_impl): Convert to...
(diagnostic_context::diagnostic_impl): ...this.
(diagnostic_n_impl): Convert to...
(diagnostic_context::diagnostic_n_impl): ...this.
(emit_diagnostic): Explicitly use global_dc for method call.
(emit_diagnostic_valist): Likewise.
(emit_diagnostic_valist_meta): Likewise.
(inform): Likewise.
(inform_n): Likewise.
(warning): Likewise.
(warning_at): Likewise.
(warning_meta): Likewise.
(warning_n): Likewise.
(pedwarn): Likewise.
(permerror): Likewise.
(permerror_opt): Likewise.
(error): Likewise.
(error_n): Likewise.
(error_at): Likewise.
(error_meta): Likewise.
(sorry): Likewise.
(sorry_at): Likewise.
(fatal_error): Likewise.
(internal_error): Likewise.
(internal_error_no_backtrace): Likewise.
* diagnostic.h (diagnostic_context::diagnostic_impl): New decl.
(diagnostic_context::diagnostic_n_impl): New decl.
* gcc-rich-location.h (gcc_rich_location::add_location_if_nearby):
Add "ctxt" param.

Signed-off-by: David Malcolm 
---
 gcc/c/c-parser.cc|   2 +-
 gcc/cp/parser.cc |  11 +--
 gcc/diagnostic-path.cc   | 125 
 gcc/diagnostic-show-locus.cc |  25 +++
 gcc/diagnostic.cc| 136 +--
 gcc/diagnostic.h |   8 +++
 gcc/gcc-rich-location.h  |   6 +-
 7 files changed, 166 insertions(+), 147 deletions(-)

diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index e83e9c683f7..78e53fd82ed 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -1248,7 +1248,7 @@ c_parser_require (c_parser *parser,
   bool added_matching_location = false;
   if (matching_location != UNKNOWN_LOCATION)
added_matching_location
- = richloc.add_location_if_nearby (matching_location);
+ = richloc.add_location_if_nearby (*global_dc, matching_location);
 
   if (c_parser_error_richloc (parser, msgid, &richloc))
/* If we weren't able to consolidate matching_location, then
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index e7409b856f1..3fd0c5fc5b4 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -3297,
