Re: [PATCH] COBOL 3/15 92K bld: config and build machinery

2025-02-17 Thread Sam James
"James K. Lowden"  writes:

> On Sat, 15 Feb 2025 21:18:50 +
> Sam James  wrote:
>
>> Please generate these files with vanilla autoconf-2.69, not
>> distro-patched autoconf.
>
> Sure thing, Sam.  I meant to do that; I thought I did. It might be that the 
> distro's autoconf still sneaked in again later.  
>
> How did you spot it?  I'd like to add an automatic check, especially before 
> submitting patches.  

Unfortunately by experience (knowing that runstatedir got backported by
many distros to 2.69). But you can copy what we do for the sourceware
buildbot by either running
https://sourceware.org/cgit/builder/tree/builder/containers/autoregen.py
and seeing if any diff is produced, or something similar.
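
As a rough local pre-check of the runstatedir symptom mentioned above, one could scan a generated configure script for that word. This is only a sketch: the helper name mentions_runstatedir is hypothetical, and it assumes that a hit indicates a distro backport, which would be wrong if the project's own configure.ac uses runstatedir. It does not replace regenerating with vanilla autoconf and diffing.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: returns true if any line of the stream mentions
// "runstatedir".  Assumption: output of unpatched autoconf 2.69 does not
// contain that word unless the project's own configure.ac introduces it.
bool mentions_runstatedir(std::istream &in)
{
  std::string line;
  while (std::getline(in, line))
    if (line.find("runstatedir") != std::string::npos)
      return true;
  return false;
}
```

To check a real file, pass a std::ifstream opened on the generated configure script.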


Re: [PATCH] COBOL 12/15 24K pos: Posix adapter framework

2025-02-17 Thread James K. Lowden
On Sat, 15 Feb 2025 21:24:52 +
Sam James  wrote:

> > +prototypes.cpp: posix.txt
> > +   awk -F'[/.]' '{ print $$6 }' $^ | \
> > +   while read F; do echo "/* $$F */" && man 2 $$F | \
> > +   ./scrape.awk -v funcname=$$6; done > $@~
> > +   @mv $@~ $@
> > +
> > +posix.txt:
> > +   zgrep -l 'POSIX[.]' /usr/share/man/man2/*z > $@~
> 
> This will need reworking. It assumes the location of the man pages on
> the system, assumes 'zgrep' exists, and assumes 'zgrep' can read the
> man pages (the man pages may be compressed with something else; I know
> such systems exist).
> 
> I'm not sure this is really any less brittle or more robust than just
> listing the actual functions you scraped out from your system.

You might be reading more into this than you want to.  

As you saw in gcc/cobol/posix/README.md, the files in that directory are not 
part of the compiler.  They are tools we provide that potentially make it 
easier to generate user-defined COBOL functions that call functions in the C 
standard library, in particular syscalls.  IMO they don't need to be perfect; 
it is enough that they are good.  

The user need never touch this part of the system.  The compiler functions 
without it.  It's there as a convenience and demonstration.  I hope to 
encourage contributions from users to this directory in a "contrib/" kind of 
way.  

There are dependencies beyond the ones you mention, not least (as documented) 
the Python PLY module.  Anyone sitting down with this tool will have to wrestle 
with it a bit.  I contend that, if the user needs more than a few functions, it 
will be less trouble to engage the tool than to write them by hand.  

I agree it could be improved.  For example, 

> +posix.txt:
> + zgrep -l 'POSIX[.]' /usr/share/man/man2/*z > $@~

could be

posix.txt:
$(ZGREP) -l 'POSIX[.]' $(MANDIR)/man/man2/*z > $@~

but that doesn't gain us much, does it?  We could start over with autoconf & 
automake, to ensure full portability.  But that would defeat the purpose.  What 
I want to provide here is a prototype, not a robust foolproof tool.  

I think a simple example -- even a brittle one loaded with assumptions -- is 
easier to understand and serves as a better illustration than a complicated 
one.  I want to provide such a tool as part of gcobol, to give the user a 
facility not available from any other COBOL compiler.  I think it's better 
included in the gcc distribution than as an SO post or FAQ at 
http://www.cobolworx.com.  

I'm sure you agree we don't want to let this tail wag the dog.  With my 
exegesis in mind, what would you recommend?  If it's limited to more judicious 
use of makefile variables, I could surely implement those suggestions.  

--jkl



Re: [PATCH] COBOL 8/15 360K cbl: parser support

2025-02-17 Thread James K. Lowden
On Sat, 15 Feb 2025 23:37:20 -0500
David Malcolm  wrote:

> +const char *
> +cobol_get_sarif_source_language(const char *)
> +{
> +return "cobol";
> +}
> 
> Out of curiosity, did you try the SARIF output?  This is a good test
> for whether you're properly using the GCC diagnostics subsystem.

How do I do that?  I barely know the term; I have to look it up every time.  I 
don't find "sarif" anywhere in gcc.info or gccint.info.  

No objection, just flummoxed.  

--jkl



[PATCH v2 09/16] Add assembler_name to cgraph_function_version_info.

2025-02-17 Thread Alfie Richards

This adds the assembler_name member to cgraph_function_version_info
to store the base assembler name for the function to be mangled. This is
used in later patches for refactoring FMV mangling.

gcc/ChangeLog:

* cgraph.cc (cgraph_node::insert_new_function_version): Record
assembler_name.
* cgraph.h (struct cgraph_function_version_info): Add assembler_name.
---
 gcc/cgraph.cc | 1 +
 gcc/cgraph.h  | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc
index bf6b43d00db..984ddfadfff 100644
--- a/gcc/cgraph.cc
+++ b/gcc/cgraph.cc
@@ -187,6 +187,7 @@ cgraph_node::insert_new_function_version (void)
   version_info_node = NULL;
   version_info_node = ggc_cleared_alloc ();
   version_info_node->this_node = this;
+  version_info_node->assembler_name = DECL_ASSEMBLER_NAME (this->decl);
 
   if (cgraph_fnver_htab == NULL)
 cgraph_fnver_htab = hash_table::create_ggc (2);
diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index 065fcc742e8..d9177364b7a 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -856,6 +856,9 @@ struct GTY((for_user)) cgraph_function_version_info {
  dispatcher. The dispatcher decl is an alias to the resolver
  function decl.  */
   tree dispatcher_resolver;
+
+  /* The assembly name of the function set before version mangling.  */
+  tree assembler_name;
 };
 
 #define DEFCIFCODE(code, type, string)	CIF_ ## code,


Re: [PATCH] COBOL 12/15 24K pos: Posix adapter framework

2025-02-17 Thread Sam James
"James K. Lowden"  writes:

> On Sat, 15 Feb 2025 21:24:52 +
> Sam James  wrote:
>
>> > +prototypes.cpp: posix.txt
>> > +  awk -F'[/.]' '{ print $$6 }' $^ | \
>> > +  while read F; do echo "/* $$F */" && man 2 $$F | \
>> > +  ./scrape.awk -v funcname=$$6; done > $@~
>> > +  @mv $@~ $@
>> > +
>> > +posix.txt:
>> > +  zgrep -l 'POSIX[.]' /usr/share/man/man2/*z > $@~
>> 
>> This will need reworking. It assumes the location of the man pages on
>> the system, assumes 'zgrep' exists, and assumes 'zgrep' can read the
>> man pages (the man pages may be compressed with something else; I know
>> such systems exist).
>> 
>> I'm not sure this is really any less brittle or more robust than just
>> listing the actual functions you scraped out from your system.
>
> You might be reading more into this than you want to.  

Ah, that's a fair point.

> I'm sure you agree we don't want to let this tail wag the dog.  With my 
> exegesis in mind, what would you recommend?  If it's limited to more 
> judicious use of makefile variables, I could surely implement those 
> suggestions.  

I think it's fine as-is, but I'd ask that you keep in mind that it's
very possible there's entries missing from it even *with* this, and that
having a script like this at all may give a false sense of completeness
or correctness.

(Not all of the functions have a distinct page, for example, and not
everything is documented. There's also a distinct man-pages-posix
collection which you may or may not have installed.)

But fair enough.

thanks,
sam


Re: [PATCH] COBOL 6/15 156K lex: lexer

2025-02-17 Thread James K. Lowden
On Sat, 15 Feb 2025 23:32:37 -0500
David Malcolm  wrote:

In defense of lack of free(3) ...

> > +const char *
> > +esc( size_t len, const char input[] ) {
> > +  static char spaces[] = "([,;]?[[:space:]])+";
> > +  static char spaceD[] = "(\n {6}D" "|" "[,;]?[[:space:]])+";
> > +  static char buffer[64 * 1024];
> > +  char *p = buffer;
> > +  const char *eoinput = input + len;
> > +
> > +  const char *spacex = is_reference_format()? spaceD : spaces;
> > +
> > +  for( const char *s=input; *s && s < eoinput; s++ ) {
> > +    *p = '\0';
> > +    gcc_assert( size_t(p - buffer) < sizeof(buffer) - 4 );

overflow guarded here

> > +    switch(*s) {
> > +    case '^': case '$':
> > +    case '(': case ')':
> > +    case '*': case '+': case '?':
> > +    case '[': case ']':
> > +    case '{': case '}':
> > +    case '|':
> > +    case '.':
> > +  *p++ = '\\';
> > +  *p++ = *s;
> > +  break;
> > +    case '\\':
> > +  *p++ = '[';
> > +  *p++ = *s;
> > +  *p++ = ']';
> > +  break;
> > +
> > +    case ';': case ',':
> > +  if( ! (s+1 < eoinput && s[1] == 0x20) ) {
> > +    *p++ = *s;
> > +    break;
> > +  }
> > +  __attribute__((fallthrough));
> > +    case 0x20: case '\n':
> > +  gcc_assert(p + sizeof(spacex) < buffer + sizeof(buffer));

and overflow guarded here, the only place where more than 4 characters can be 
inserted into the buffer.  

> > +  p = stpcpy( p, spacex );
> > +  while( s+1 < eoinput && is_separator_space(s+1)) {
> > +    s++;
> > +  }
> > +  break;
> > +    default:
> > +  *p++ = *s;
> > +  break;
> > +    }
> > +  }
> > +  *p = '\0';
...
> > +  return xstrdup(buffer);
> > +}
> 
> Has a fixed size 64k buffer; doesn't seem to have proper overflow
> handling. Could use a pretty_printer to accumulate chars.

Thank you for these comments.  Let me see if I can alleviate your concerns.  

This function is called from exactly one place, where the file-reader, lexio, 
parses a REPLACE directive.  The COBOL input says "REPLACE X BY Y" subject to 
some constraints.  Because we're using regex to find X, and because X might be 
any arbitrary string, the esc() function escapes regex metacharacters prior to 
executing the regex.  
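
For comparison, here is a minimal sketch of the same metacharacter escaping accumulated into a growable std::string instead of a fixed buffer. The name escape_regex is hypothetical and this is not the patch's esc(): it covers only the metacharacter cases and omits the handling of separators and whitespace runs.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not the patch's esc(): escape POSIX extended-regex
// metacharacters so the input matches literally.  The separator and
// whitespace handling of the original function is deliberately omitted.
std::string escape_regex(const std::string &input)
{
  std::string out;
  out.reserve(input.size() * 2);   // most characters expand to at most two
  for (char c : input) {
    switch (c) {
    case '^': case '$':
    case '(': case ')':
    case '*': case '+': case '?':
    case '[': case ']':
    case '{': case '}':
    case '|':
    case '.':
      out += '\\';                 // backslash-escape the metacharacter
      out += c;
      break;
    case '\\':
      out += "[\\]";               // a backslash becomes a bracket expression
      break;
    default:
      out += c;
      break;
    }
  }
  return out;
}
```

The string grows as needed, so there is no size assertion to maintain; the cost is a heap allocation for inputs beyond the small-string capacity.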

IMHO it's unlikely the resulting regex input will exceed 64 KB.  It's unlikely 
to be even 64 bytes.  (Usually X is a COBOL identifier, limited to 64 bytes by 
ISO.)  Granted, a fixed maximum is a limitation.  But I put it to you: which is 
more likely?  For a regex to exceed 64 KB, or for heap allocation to fail to 
return adequate memory?  If there's some crazy input, I'd rather die at 64 KB 
than consume gigabytes of swap on the way to crashing.  

(As a matter of fact, the function returns a copy of the used portion of the 
static buffer, an unnecessary allocation.  The caller soon replaces it, and 
could have used a pointer into that static buffer.)  

If there is a built-in function in gcc to escape a regex, that would be 
preferable to this gnarly code.  Is that what you mean by "pretty printer"?  

> Returns an allocated buffer as "const char *", which should be just a
> "char  *".

The caller has no business writing to the allocated buffer, which is input to 
the regex call.  Caller and called agree on const.  Isn't that what "const" is 
for?  

--jkl


Re: [PATCH] COBOL 5/15 380K hdr: header files

2025-02-17 Thread Sam James
"James K. Lowden"  writes:

> On Sat, 15 Feb 2025 21:30:16 +
> Sam James  wrote:
>
>> > + * This stand-in for std::regex was written because the
>> > implementation provided
>> > + * by the GCC libstdc++ in GCC 11 proved too slow, where "slow"
>> > means "appears
>> > + * not to terminate".  Some invocations of std::regex_search took
>> > over 5
>> 
>> Is this still the case now in GCC trunk (15)? Is there a bug report to
>> link to in the comment if so?
>
> I didn't pursue a bug report for this problem, so as not to try to boil the 
> ocean. 
>
> AFAIK, the poor performance of std::regex is widely acknowledged and is 
> somehow a feature of how it's defined.  Jonathan Wakely understands the 
> problem better than I do.  
>
> Although under no obligation to use std::regex, I thought I'd try it
> out and, honestly, it's not a bad interface.  But the performance was
> awful.  It was easy to re-implement what I needed from std::regex in
> terms of regex(3), and left the door open to revert simply by changing
> "using namespace dts".
>
> Is the state of gcc-15 relevant, though?  gcc is frequently built
> using whatever C++ compiler is installed.  If my understanding is
> correct, to rely on the installed std::regex is just to set a trap for
> the user.

As richi said, COBOL isn't included as a stage 1 language, so the stage1
compiler is irrelevant really.

>
> --jkl


Re: [PATCH] COBOL 5/15 380K hdr: header files

2025-02-17 Thread James K. Lowden
On Sat, 15 Feb 2025 21:30:16 +
Sam James  wrote:

> > +cbl_refer_t *
> > +negate( cbl_refer_t * refer, bool neg = true ) {
> > +  if( ! neg ) return refer;
> > +  assert( is_numeric(refer->field) );
> 
> These should be gcc_assert or gcc_checking_assert in general,
> depending on the severity (do you want to always assert the
> invariant, or is it okay to only do it for debugging ('checking')
> builds of the compiler)?

I would like to explain what you're seeing, and hope to convince you it's for 
the best.  

There are two issues here, the form of the assert, and the purpose.  

By design, the lex & yacc parts of the COBOL front end are pure Posix.  
Originally they didn't use any gcc header files.  The reader of these files 
should be able to read them with no knowledge of gcc internals.  After all, 
that reader was once me.  :-)

In addition to being bog-standard C, assert(3) does something gcc_assert does 
not: it prints the text of the asserted condition.  That plus the stack trace 
is more interesting to me than the source line.  I don't remember the last time 
I cared about the COBOL input to track down an invariant.  
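
To illustrate how an assert-style macro can report the text of the condition, here is a small sketch using the preprocessor's stringification operator. The names failure_message and CONDITION_MESSAGE are hypothetical, and the message format is invented for illustration; the exact output of assert(3) varies by C library.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: build a message that carries the condition text.
std::string failure_message(const char *expr, const char *file, int line)
{
  return std::string(file) + ":" + std::to_string(line)
       + ": assertion failed: " + expr;
}

// The # operator turns the macro argument into a string literal without
// evaluating it, which is how the source text of the condition survives
// into the message.
#define CONDITION_MESSAGE(expr) failure_message(#expr, __FILE__, __LINE__)
```

For example, CONDITION_MESSAGE(is_numeric(refer->field)) yields a string ending in the literal text is_numeric(refer->field).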

That brings us to the purpose.  This assert and many others are guarded by the 
yacc grammar.  Before the parser calls cbl_refer_t::negate, it should have 
ensured that the input is numeric, because non-numeric input is a semantic 
error (not syntax error) in the COBOL that the parser reports through the 
diagnostic framework.  IOW, we're only asserting that the parser did its job 
correctly.  

cscope reports 465 uses of assert(3) in our front end.  I honestly believe gcc 
is best off leaving them exactly as they are.  They are an aid to the 
developer, mostly to enforce preconditions.  The user should never see the 
message and if he does no amount of friendliness is any help.  A developer will 
have to correct the logic error.  

--jkl


Re: [PATCH] COBOL 6/15 156K lex: lexer

2025-02-17 Thread Sam James
"James K. Lowden"  writes:

> On Sat, 15 Feb 2025 23:32:37 -0500
> David Malcolm  wrote:
>
> In defense of lack of free(3) ...
>
>> > +const char *
>> > +esc( size_t len, const char input[] ) {
>> > +  static char spaces[] = "([,;]?[[:space:]])+";
>> > +  static char spaceD[] = "(\n {6}D" "|" "[,;]?[[:space:]])+";
>> > +  static char buffer[64 * 1024];
>> > +  char *p = buffer;
>> > +  const char *eoinput = input + len;
>> > +
>> > +  const char *spacex = is_reference_format()? spaceD : spaces;
>> > +
>> > +  for( const char *s=input; *s && s < eoinput; s++ ) {
>> > +    *p = '\0';
>> > +    gcc_assert( size_t(p - buffer) < sizeof(buffer) - 4 );
>
> overflow guarded here
>
>> > +    switch(*s) {
>> > +    case '^': case '$':
>> > +    case '(': case ')':
>> > +    case '*': case '+': case '?':
>> > +    case '[': case ']':
>> > +    case '{': case '}':
>> > +    case '|':
>> > +    case '.':
>> > +  *p++ = '\\';
>> > +  *p++ = *s;
>> > +  break;
>> > +    case '\\':
>> > +  *p++ = '[';
>> > +  *p++ = *s;
>> > +  *p++ = ']';
>> > +  break;
>> > +
>> > +    case ';': case ',':
>> > +  if( ! (s+1 < eoinput && s[1] == 0x20) ) {
>> > +    *p++ = *s;
>> > +    break;
>> > +  }
>> > +  __attribute__((fallthrough));
>> > +    case 0x20: case '\n':
>> > +  gcc_assert(p + sizeof(spacex) < buffer + sizeof(buffer));
>
> and overflow guarded here, the only place where more than 4 characters can be 
> inserted into the buffer.  
>
>> > +  p = stpcpy( p, spacex );
>> > +  while( s+1 < eoinput && is_separator_space(s+1)) {
>> > +    s++;
>> > +  }
>> > +  break;
>> > +    default:
>> > +  *p++ = *s;
>> > +  break;
>> > +    }
>> > +  }
>> > +  *p = '\0';
> ...
>> > +  return xstrdup(buffer);
>> > +}
>> 
>> Has a fixed size 64k buffer; doesn't seem to have proper overflow
>> handling. Could use a pretty_printer to accumulate chars.
>
> Thank you for these comments.  Let me see if I can alleviate your concerns.  
>
> This function is called from exactly one place, where the file-reader,
> lexio, parses a REPLACE directive.  The COBOL input says "REPLACE X BY
> Y" subject to some constraints.  Because we're using regex to find X,
> and because X might be any arbitrary string, the esc() function
> escapes regex metacharacters prior to executing the regex.
>
> IMHO it's unlikely the resulting regex input will exceed 64 KB.  It's
> unlikely to be even 64 bytes.  (Usually X is a COBOL identifier,
> limited to 64 bytes by ISO.)  Granted, a fixed maximum is a
> limitation.  But I put it to you: which is more likely?  For a regex
> to exceed 64 KB, or for heap allocation to fail to return adequate
> memory?  If there's some crazy input, I'd rather die at 64 KB than
> consume gigabytes of swap on the way to crashing.
>
> (As a matter of fact, the function returns a copy of the used portion
> of the static buffer, an unnecessary allocation.  The caller soon
> replaces it, and could have used a pointer into that static buffer.)
>
> If there is a built-in function in gcc to escape a regex, that would be 
> preferable to this gnarly code.  Is that what you mean by "pretty printer"?  
>
>> Returns an allocated buffer as "const char *", which should be just a
>> "char  *".
>
> The caller has no business writing to the allocated buffer, which is input to 
> the regex call.  Caller and called agree on const.  Isn't that what "const" 
> is for?  

If it's allocated, it should be freed at some point. If it's const,
freeing it looks wrong.
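
One possible way to keep both properties, read-only access for the caller and an explicit owner that frees, is an owning pointer to const with a custom deleter. This is a sketch only; owned_cstr, free_deleter and duplicate are hypothetical names and not part of the patch.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <memory>

// Hypothetical deleter: the const_cast lives in one place, so callers
// never need to cast away const in order to release the buffer.
struct free_deleter {
  void operator()(const char *p) const { std::free(const_cast<char *>(p)); }
};

// Owning handle to a read-only, malloc-allocated C string.
using owned_cstr = std::unique_ptr<const char, free_deleter>;

// Hypothetical stand-in for a function that returns an allocated string.
owned_cstr duplicate(const char *s)
{
  std::size_t n = std::strlen(s) + 1;
  char *copy = static_cast<char *>(std::malloc(n));
  if (!copy)
    std::abort();
  std::memcpy(copy, s, n);
  return owned_cstr(copy);
}
```

A caller passes p.get() to the C API and the buffer is released when the handle goes out of scope.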

>
> --jkl


[pushed] c++: add fixed test [PR102455]

2025-02-17 Thread Marek Polacek
Tested x86_64-pc-linux-gnu, applying to trunk.

-- >8 --
Fixed by r13-4564 but the tests are very different.

PR c++/102455

gcc/testsuite/ChangeLog:

* g++.dg/ext/vector43.C: New test.
---
 gcc/testsuite/g++.dg/ext/vector43.C | 7 +++
 1 file changed, 7 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/vector43.C

diff --git a/gcc/testsuite/g++.dg/ext/vector43.C 
b/gcc/testsuite/g++.dg/ext/vector43.C
new file mode 100644
index 000..6efbe0ff197
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/vector43.C
@@ -0,0 +1,7 @@
+// PR c++/102455
+// { dg-do compile { target c++14 } }
+
+typedef int v4si;
+typedef float v4sf __attribute__ ((vector_size(4)));
+constexpr v4sf foo (v4si a) { return (v4sf)a;}
+template  constexpr v4sf b = foo (v4si {});

base-commit: 720c8f685210af9fc9c31810e224751102f1481e
-- 
2.48.1



Re: [PATCH] COBOL 8/15 360K cbl: parser support

2025-02-17 Thread James K. Lowden
On Sat, 15 Feb 2025 23:37:20 -0500
David Malcolm  wrote:

> +  rich_location richloc (line_table, token_location);
> +  bool ret = global_dc->diagnostic_impl (&richloc, nullptr,
> option_id,
> + gmsgid, &ap, DK_ERROR);
> +  va_end (ap);
> +  global_dc->end_group();
> +}
> 
> For errors, just pass 0 as the diagnostic_option_id.  Same for the
> various DK_SORRY and DK_FATAL.

OK, but is this a style thing?  That's effectively what happens, using a name.  

option_id is a file-scope static constant, initialized to 0.  Instead of 
passing an integer that the compiler uses to construct a temporary 
diagnostic_option_id, we pass an already-constructed diagnostic_option_id by 
value.  

(Maybe zero_option_id would be a better name?)

> +bool
> +yywarn( const char gmsgid[], ... ) {
> +  verify_format(gmsgid);
> +  auto_diagnostic_group d;
> +  va_list ap;
> +  va_start (ap, gmsgid);
> +  auto ret = emit_diagnostic_valist( DK_WARNING, token_location,
> + option_id, gmsgid, &ap );
> +  va_end (ap);
> +  return ret;
> +}
> 
> For warnings, ideally this should take a diagnostic_option_id
> controlling the warning as the initial parameter, rather than have a
> global variable for this.  

Yes, absolutely.  That's on the to do list.  I wanted to get a set of patches 
submitted for consideration, and drew the line ahead of that item.  

> Is this something that yacc is imposing on you?

Not at all.  I need to go into gcc/cobol/lang.opt and enumerate the warnings.  
Then I need to pass the warning ID into yywarn (which will be renamed 
warn_msg() because the "yy" prefix is properly reserved for yacc).  

As we say, just a small matter of programming.  :-) 

--jkl


Re: [PATCH] COBOL 6/15 156K lex: lexer

2025-02-17 Thread James K. Lowden
On Sat, 15 Feb 2025 23:32:37 -0500
David Malcolm  wrote:

> > +  free(copier);
> 
> There's a manual free of "copier" here, but there are various
> error-handling early-return paths that will leak. Maybe just use a
> std::string?
> 
> Similarly with "path"; I think this is always leaked. Maybe
> std::string here too.
> 
> Have you tried running the compiler under valgrind?  Configure with
> --enable-valgrind-annotations and pass -wrapper=valgrind to the
> driver.

It's no accident, comrade.  ;-) 

My design criterion: the parser's memory requirements are linear with the 
input.  As grows the COBOL text, so grows the memory consumption.  Any more 
would be wrong; any less is pointless.  

The parser makes a single pass over the input.  It can't "leak" except in a 
loop.  In general it therefore never calls free(3).  Only when a string is 
being built up of consecutive allocations in a loop do we take any care with 
free.  

Before anyone suggests that's wasteful of memory, let's remind ourselves of the 
compiler's design.  The input text is transformed into a GENERIC tree.  The 
compiled program *must* fit in memory.  Much of that input needs to be retained 
as debug strings and supplied values.  

We have tested gcobol on large COBOL inputs, hundreds of thousands of lines.  
gcobol compiles those programs, as is, without freeing memory, on virtual 
machines that can barely link cc1.  

In the instant case, when open_file() fails, the compiler will soon terminate 
for lack of input.  Code generation will not be engaged.  The lost strings are 
literally no concern at all.  (I can see why that might not be obvious on first 
reading, and I hope I don't sound harsh or dismissive.)  

As for std::string, it would only complicate the front end.  Most strings in 
the parser are either fed back in to some C API or become parts of GENERIC 
nodes.  I admit having been tempted by std::stringstream for concatenation but 
in the end asprintf did most of what was needed.  

> Have you tried running the compiler under valgrind?  Configure with
> --enable-valgrind-annotations and pass -wrapper=valgrind to the driver.

We have not tried that incantation, no.  We used valgrind for corruption 
problems, both times.  ;-)   We measured performance and memory use with the 
excellent Linux perf tool.  That is how we found the astonishing problems with 
std::regex (and also with how we were using it).  

I hope you see that we're taking care, but not too much care, with memory.  If 
there is a loop that is missing a free, we'll fix it, but I bet it won't 
matter, because these are just string fragments for filenames and such.  To 
dutifully call free for every allocated scrap of parsed input would be 
counterproductive: error-prone, and little if anything saved.  

--jkl


Re: [PATCH] COBOL 9/15 532K api: GENERIC interface

2025-02-17 Thread Richard Biener
On Sun, Feb 16, 2025 at 9:20 PM Robert Dubner  wrote:
>
>
>
> > -Original Message-
> > From: David Malcolm 
> > Sent: Saturday, February 15, 2025 23:39
> > To: James K. Lowden ; gcc-patches@gcc.gnu.org
> > Subject: Re: [PATCH] COBOL 9/15 532K api: GENERIC interface
> >
> > On Sat, 2025-02-15 at 16:02 -0500, James K. Lowden wrote:
> > > From 5d53920602e234e4d99ae2d502e662ee3699978e 4 Oct 2024 12:01:22 -
> > > 0400
> > > From: "James K. Lowden" 
> > > Date: Sat 15 Feb 2025 12:50:53 PM EST
> > > Subject: [PATCH] 2 new 'cobol' FE files
> > >
> > > gcc/cobol/ChangeLog
> > > * genapi.cc: New file.
> > > * genapi.h: New file.
> > >
> >
> > +static tree label_list_out_goto;
> > +static tree label_list_out_label;
> > +static tree label_list_back_goto;
> > +static tree label_list_back_label;
> >
> > Any time we have a static or extern tree, I wonder if it should have a
> > GTY(()) marker to mark it as a garbage collection root.  Have you stress-
> > tested this with the params that force a collection on every GC
> > opportunity?
> >
> > +static std::map >
> > +call_targets; static std::map
> > +called_targets;
> >
> > Similarly here, but the trees in question are deep within global data
> > structures that don’t know about the GC.
> >
> > Hopefully the GC can never run during the times when these structures are
> > live, otherwise you’ve got problems…
>
> Okay.  I simply do not pretend to understand garbage collection in GCC.  I
> hope nobody disagrees when I state that the documentation in
> gcc_internals_15.pdf is sparse.
>
> I have run the cobol compiler's entire existing suite of test programs --
> there are about 1,000 of them -- with the options
>
> --param=ggc-min-expand=0 --param=ggc-min-heapsize=0
>
> No errors were observed.
>
> The internals document has section "23.6 How to invoke the garbage
> collector" wherein appears the statement "So the only way to have GGC
> reclaim storage is to call the ggc_collect function explicitly."
>
> Well, *I* am certainly not calling ggc_collect.  So, I have to ask:  When
> might the garbage collector run?  Am I supposed to be calling ggc_collect()?

You are not required to call ggc_collect (), most frontends don't do
that directly
unless they create a lot of garbage (C++ for example does).
But once control hands off to the middle-end (when the parse_file () langhook
finished) we eventually do.  The risk then is that via invocation of a langhook
the frontend looks at its data structures which might have been GCed.  I've
skimmed yours and they look like they do not have any issue.

So I'd say you're fine at the moment.

Richard.

> >
> >
> > +  IF( left_side, lt_op, gg_cast(TREE_TYPE(left_side),
> > integer_zero_node) )
> > +{
> > +if( debugging )
> > +  {
> > +  gg_printf("normal_normal_compare(): different types returning -
> > 1\n",
> > +NULL_TREE);
> > +  }
> > +gg_assign( return_int, integer_minusone_node);
> > +}
> > +  ELSE
> >
> > What’s with the uppercase IF and ELSE?  Is this just regular control flow
> > or is some kind of preprocessor magic going on?
>
> They are sanity-preserving "preprocessor magic" to implement the GENERIC for
> conditional processing.  The definitions are found in gengen.h:
>
> #define IF(a,b,c) gg_if((a),(b),(c));
> #define ELSE current_function->statement_list_stack.pop_back();
> #define ENDIF current_function->statement_list_stack.pop_back();
>
> The function gg_if, in turn, builds a relational expression and puts the two
> needed statement lists -- the first for "when true", the second for "when
> false" -- onto a stack so that expressions can easily be appended to them.
>
> So, if you mean by "regular control flow" ordinary C/C++ if/else processing,
> no.  They implement GENERIC conditional processing.
>
>
> >
> > +  if( paragraph )
> > +{
> > +sprintf(ach, "%s", paragraph);
> > +strcat(retval, ach);
> > +}
> > +  strcat(retval, ".");
> > +  if( section )
> > +{
> > +sprintf(ach, "%s", section);
> > +strcat(retval, ach);
> > +}
> > +  strcat(retval, ".");
> > +  if( mangled_program_name )
> > +{
> > +strcat(retval, mangled_program_name);
> > +}
> > +  sprintf(ach, ".%ld", current_function->program_id_number);
> > +  strcat(retval, ach);
> > +  sprintf(ach, ".%ld", deconflictor);
> > +  strcat(retval, ach);
> >
> > “ach” and “retval” are fixed-sized buffers; is all this string
> > manipulation guaranteed to fit?  Similarly in assembler_label.
>
> Although there is a bit of faith involved, yes, I long ago did the
> appropriate calculations to "guarantee"  that the strings involved
> will fit.  COBOL identifiers are limited to 63 characters.
>
> That said, there are pathological name-mangling cases that could cause those
> limits to be exceeded.  So, I will rewrite that code so as not to depend on
> magically-sized fixed-length buffers.
>
> [Having mentioned the pathology, I feel compelled to 

[PATCH] tree-optimization/118895 - ICE during PRE

2025-02-17 Thread Richard Biener
When we simplify a NARY during PHI translation we have to make sure
to not inject not available operands into it given that might violate
the valueization hook constraints and we'd pick up invalid
context-sensitive data in further simplification or as in this case
later ICE when we try to insert the expression.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/118895
* tree-ssa-sccvn.cc (vn_nary_build_or_lookup_1): Only allow
CSE if we can verify the result is available.

* gcc.dg/pr118895.c: New testcase.
---
 gcc/testsuite/gcc.dg/pr118895.c | 13 +
 gcc/tree-ssa-sccvn.cc   | 13 -
 2 files changed, 21 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr118895.c

diff --git a/gcc/testsuite/gcc.dg/pr118895.c b/gcc/testsuite/gcc.dg/pr118895.c
new file mode 100644
index 000..ca61d2cc1b1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr118895.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+unsigned long a;
+void fn1()
+{
+  unsigned long e = a;
+  int c = e;
+  int d = c < 100 ? c : 0;
+  if (d + (int)e & 608)
+while (e & 608)
+  e <<= 1;
+}
diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
index 8bb45780a98..146840664e2 100644
--- a/gcc/tree-ssa-sccvn.cc
+++ b/gcc/tree-ssa-sccvn.cc
@@ -366,6 +366,10 @@ static vn_ssa_aux_t last_pushed_avail;
correct.  */
 static vn_tables_t valid_info;
 
+/* Global RPO state for access from hooks.  */
+static class eliminate_dom_walker *rpo_avail;
+basic_block vn_context_bb;
+
 
 /* Valueization hook for simplify_replace_tree.  Valueize NAME if it is
an SSA name, otherwise just return it.  */
@@ -2501,7 +2505,10 @@ vn_nary_build_or_lookup_1 (gimple_match_op *res_op, bool 
insert,
   bool res = false;
   if (i == res_op->num_ops)
 {
-  mprts_hook = vn_lookup_simplify_result;
+  /* Do not leak not available operands into the simplified expression
+when called from PRE context.  */
+  if (rpo_avail)
+   mprts_hook = vn_lookup_simplify_result;
   res = res_op->resimplify (NULL, vn_valueize);
   mprts_hook = NULL;
 }
@@ -2684,10 +2691,6 @@ public:
   vn_avail *m_avail_freelist;
 };
 
-/* Global RPO state for access from hooks.  */
-static eliminate_dom_walker *rpo_avail;
-basic_block vn_context_bb;
-
 /* Return true if BASE1 and BASE2 can be adjusted so they have the
same address and adjust *OFFSET1 and *OFFSET2 accordingly.
Otherwise return false.  */
-- 
2.43.0


Re: [PATCH v2] ira: Add a target hook for callee-saved register cost scale

2025-02-17 Thread Jan Hubicka
> Jan Hubicka  writes:
> >> As described below, the patch also shows no change to AArch64 SPEC2017
> >> scores.  I'm afraid I'll need help from x86 folks to do performance
> >> testing there.
> >
> > I will look into this over weekend. I can write x86 version of the
> > hooks. Though in earlier email you mentioned you hacked up something, so
> > if you can share it with me, perhaps I can start from there.
> 
> Sorry, was away until today.  But the earlier hacked up version was
> essentially what I included in this patch.

I added few special cases to that (since we don't always push) but
overall it looks good to me.  I hope to have benchmarks done today
(I was benchmarking some other patches over weekend).

Honza
> 
> Thanks,
> Richard


[PATCH v2 15/16] Add error cases and tests for Aarch64 FMV.

2025-02-17 Thread Alfie Richards

This changes the ambiguation error for C++ to cover cases of differently
annotated FMV function sets whose signatures only differ by their return
type.

It also adds tests covering many FMV errors for Aarch64, including
redeclaration, and mixing target_clones and target_versions.

gcc/cp/ChangeLog:

* decl.cc (duplicate_decls): Change logic to not always exclude FMV
annotated functions in cases of return type non-ambiguation.

gcc/testsuite/ChangeLog:

* g++.target/aarch64/mv-and-mvc-error1.C: New test.
* g++.target/aarch64/mv-and-mvc-error2.C: New test.
* g++.target/aarch64/mv-and-mvc-error3.C: New test.
* g++.target/aarch64/mv-error1.C: New test.
* g++.target/aarch64/mv-error2.C: New test.
* g++.target/aarch64/mv-error3.C: New test.
* g++.target/aarch64/mv-error4.C: New test.
* g++.target/aarch64/mv-error5.C: New test.
* g++.target/aarch64/mv-error6.C: New test.
* g++.target/aarch64/mv-error7.C: New test.
* g++.target/aarch64/mv-error8.C: New test.
* g++.target/aarch64/mvc-error1.C: New test.
* g++.target/aarch64/mvc-error2.C: New test.
---
 gcc/cp/decl.cc|  7 +--
 .../g++.target/aarch64/mv-and-mvc-error1.C| 10 +
 .../g++.target/aarch64/mv-and-mvc-error2.C| 10 +
 .../g++.target/aarch64/mv-and-mvc-error3.C|  9 
 gcc/testsuite/g++.target/aarch64/mv-error1.C  | 19 +
 gcc/testsuite/g++.target/aarch64/mv-error2.C  | 10 +
 gcc/testsuite/g++.target/aarch64/mv-error3.C  | 13 
 gcc/testsuite/g++.target/aarch64/mv-error4.C  | 10 +
 gcc/testsuite/g++.target/aarch64/mv-error5.C  |  9 
 gcc/testsuite/g++.target/aarch64/mv-error6.C  | 21 +++
 gcc/testsuite/g++.target/aarch64/mv-error7.C  | 12 +++
 gcc/testsuite/g++.target/aarch64/mv-error8.C  | 13 
 gcc/testsuite/g++.target/aarch64/mvc-error1.C | 10 +
 gcc/testsuite/g++.target/aarch64/mvc-error2.C | 10 +
 14 files changed, 161 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-and-mvc-error1.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-and-mvc-error2.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-and-mvc-error3.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-error1.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-error2.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-error3.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-error4.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-error5.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-error6.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-error7.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-error8.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mvc-error1.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mvc-error2.C

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 18a03a88aa4..fc751efd117 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -2012,8 +2012,11 @@ duplicate_decls (tree newdecl, tree olddecl, bool hiding, bool was_hidden)
 	}
 	  /* For function versions, params and types match, but they
 	 are not ambiguous.  */
-	  else if ((!DECL_FUNCTION_VERSIONED (newdecl)
-		&& !DECL_FUNCTION_VERSIONED (olddecl))
+	  else if (((!DECL_FUNCTION_VERSIONED (newdecl)
+		 && !DECL_FUNCTION_VERSIONED (olddecl))
+		|| !comptypes (TREE_TYPE (TREE_TYPE (newdecl)),
+   TREE_TYPE (TREE_TYPE (olddecl)),
+   COMPARE_STRICT))
 		   /* Let constrained hidden friends coexist for now, we'll
 		  check satisfaction later.  */
 		   && !member_like_constrained_friend_p (newdecl)
diff --git a/gcc/testsuite/g++.target/aarch64/mv-and-mvc-error1.C b/gcc/testsuite/g++.target/aarch64/mv-and-mvc-error1.C
new file mode 100644
index 000..19965dca418
--- /dev/null
+++ b/gcc/testsuite/g++.target/aarch64/mv-and-mvc-error1.C
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-require-ifunc "" } */
+/* { dg-options "-O0" } */
+/* { dg-additional-options "-Wno-experimental-fmv-target" } */
+
+__attribute__ ((target_version ("dotprod"))) float
+foo () { return 3; } /* { dg-message "previously defined here" } */
+
+__attribute__ ((target_clones ("dotprod", "mve"))) float
+foo () { return 3; } /* { dg-error "redefinition of" } */
diff --git a/gcc/testsuite/g++.target/aarch64/mv-and-mvc-error2.C b/gcc/testsuite/g++.target/aarch64/mv-and-mvc-error2.C
new file mode 100644
index 000..df048260a90
--- /dev/null
+++ b/gcc/testsuite/g++.target/aarch64/mv-and-mvc-error2.C
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-require-ifunc "" } */
+/* { dg-options "-O0" } */
+/* { dg-additional-options "-Wno-experimental-fmv-target" } */
+
+__attribute__ ((target_version ("default"))) int
+foo () { return 1; } /* { dg-message "old declaration" } */
+
+__attribute__ ((target_clones ("

[PATCH v2 04/16] Remove unnecessary `record` argument from maybe_version_functions.

2025-02-17 Thread Alfie Richards

Previously, the `record` argument in maybe_version_functions allowed the
call to cgraph_node::record_function_versions to be skipped.  However,
this was only skipped when both decls were already marked as versioned,
in which case we trigger the early exit in record_function_versions
instead. Therefore, the argument is unnecessary.
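
The reasoning above can be checked on a toy model.  The sketch below is
not GCC code: ToyDecl, record_versions and the two version_* functions
are hypothetical stand-ins.  It assumes, as the message describes, that
recording exits early when both decls already carry version information.

```cpp
#include <cassert>

// Toy stand-in for a function declaration; not GCC's tree type.
struct ToyDecl
{
  bool versioned = false; // models DECL_FUNCTION_VERSIONED
  bool recorded = false;  // models "already has version info"
};

int links_made = 0;

// Models the early exit: nothing to do when both are already recorded.
void
record_versions (ToyDecl &a, ToyDecl &b)
{
  if (a.recorded && b.recorded)
    return;
  a.recorded = b.recorded = true;
  ++links_made;
}

// Old shape: the caller-computed flag decides whether to record.
void
version_with_flag (ToyDecl &newdecl, ToyDecl &olddecl)
{
  bool record = !newdecl.versioned || !olddecl.versioned;
  newdecl.versioned = olddecl.versioned = true;
  if (record)
    record_versions (newdecl, olddecl);
}

// New shape: always call and rely on the early exit.
void
version_without_flag (ToyDecl &newdecl, ToyDecl &olddecl)
{
  newdecl.versioned = olddecl.versioned = true;
  record_versions (newdecl, olddecl);
}

// Run the same sequence of pairings and count the links created.
int
count_links (bool use_flag)
{
  ToyDecl a, b, c;
  links_made = 0;
  void (*pair_up) (ToyDecl &, ToyDecl &)
    = use_flag ? version_with_flag : version_without_flag;
  pair_up (a, b); // first pairing records
  pair_up (a, b); // repeat: skipped by the flag or by the early exit
  pair_up (c, a); // a new decl joins the set
  assert (a.recorded && b.recorded && c.recorded);
  return links_made;
}
```

Both shapes create the same number of links in this model, which is the
property the patch relies on.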

gcc/cp/ChangeLog:

* class.cc (add_method): Remove argument.
* cp-tree.h (maybe_version_functions): Ditto.
* decl.cc (decls_match): Ditto.
(maybe_version_functions): Ditto.
---
 gcc/cp/class.cc  | 2 +-
 gcc/cp/cp-tree.h | 2 +-
 gcc/cp/decl.cc   | 9 +++--
 3 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/gcc/cp/class.cc b/gcc/cp/class.cc
index d5ae69b0fdf..df67ec34273 100644
--- a/gcc/cp/class.cc
+++ b/gcc/cp/class.cc
@@ -1402,7 +1402,7 @@ add_method (tree type, tree method, bool via_using)
   /* If these are versions of the same function, process and
 	 move on.  */
   if (TREE_CODE (fn) == FUNCTION_DECL
-	  && maybe_version_functions (method, fn, true))
+	  && maybe_version_functions (method, fn))
 	continue;
 
   if (DECL_INHERITED_CTOR (method))
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 84bcbf29fa0..9e66260ca3a 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -7115,7 +7115,7 @@ extern void determine_local_discriminator	(tree, tree = NULL_TREE);
 extern bool member_like_constrained_friend_p	(tree);
 extern bool fns_correspond			(tree, tree);
 extern int decls_match(tree, tree, bool = true);
-extern bool maybe_version_functions		(tree, tree, bool);
+extern bool maybe_version_functions		(tree, tree);
 extern bool validate_constexpr_redeclaration	(tree, tree);
 extern bool merge_default_template_args		(tree, tree, bool);
 extern tree duplicate_decls			(tree, tree,
diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 05ad9bb24d5..45c8b304fc0 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -1216,9 +1216,7 @@ decls_match (tree newdecl, tree olddecl, bool record_versions /* = true */)
 	  && targetm.target_option.function_versions (newdecl, olddecl))
 	{
 	  if (record_versions)
-	maybe_version_functions (newdecl, olddecl,
- (!DECL_FUNCTION_VERSIONED (newdecl)
-  || !DECL_FUNCTION_VERSIONED (olddecl)));
+	maybe_version_functions (newdecl, olddecl);
 	  return 0;
 	}
 }
@@ -1289,7 +1287,7 @@ maybe_mark_function_versioned (tree decl)
If RECORD is set to true, record function versions.  */
 
 bool
-maybe_version_functions (tree newdecl, tree olddecl, bool record)
+maybe_version_functions (tree newdecl, tree olddecl)
 {
   if (!targetm.target_option.function_versions (newdecl, olddecl))
 return false;
@@ -1312,8 +1310,7 @@ maybe_version_functions (tree newdecl, tree olddecl, bool record)
   maybe_mark_function_versioned (newdecl);
 }
 
-  if (record)
-cgraph_node::record_function_versions (olddecl, newdecl);
+  cgraph_node::record_function_versions (olddecl, newdecl);
 
   return true;
 }


[PATCH v2 12/16] Refactor FMV name mangling.

2025-02-17 Thread Alfie Richards

This patch is an overhaul of how FMV name mangling works. Previously
mangling logic was duplicated in several places across both target
specific and independent code. This patch changes this such that all
mangling is done in targetm.mangle_decl_assembler_name (including for the
dispatched symbol and dispatcher resolver).

This allows previous hacks to be removed, such as unmangling the default
decl's mangled assembler name in order to remangle all versions, the
resolver and the dispatched symbol.

This does introduce one change, shown in the test changes.  Previously,
for target annotated FMV sets, x86 set the function name to the
assembler name and remangled it.  This was hard to reproduce without
resorting to hacks I wasn't comfortable with, so the mangling is changed
to append ".ifunc", which matches clang.

This change also refactors expand_target_clone using
targetm.mangle_decl_assembler_name for mangling and get_clone_versions.

gcc/ChangeLog:

* attribs.cc (make_dispatcher_decl): Move duplicated cgraph logic into
this function and change to use targetm.mangle_decl_assembler_name for
mangling.
* config/aarch64/aarch64.cc (aarch64_parse_fmv_features): Change to
support string_slice.
(aarch64_process_target_version_attr): Ditto.
(get_feature_mask_for_version): Ditto.
(aarch64_mangle_decl_assembler_name): Add logic for mangling dispatched
symbol and resolver.
(get_suffixed_assembler_name): Removed.
(make_resolver_func): Refactor to use
aarch64_mangle_decl_assembler_name for mangling.
(aarch64_generate_version_dispatcher_body): Remove remangling.
(aarch64_get_function_versions_dispatcher): Refactor to remove
duplicated cgraph logic.
* config/i386/i386-features.cc (is_valid_asm_symbol): Moved from
multiple_target.cc.
(create_new_asm_name): Ditto.
(ix86_mangle_function_version_assembler_name): Refactor to use
clone_identifier and to mangle default.
(ix86_mangle_decl_assembler_name): Add logic for mangling dispatched
symbol and resolver.
(ix86_get_function_versions_dispatcher): Remove duplicated cgraph
logic.
(make_resolver_func): Refactor to use ix86_mangle_decl_assembler_name
for mangling.
* config/riscv/riscv.cc (riscv_mangle_decl_assembler_name): Add logic
for FMV mangling.
(get_suffixed_assembler_name): Removed.
(make_resolver_func): Refactor to use riscv_mangle_decl_assembler_name
for mangling.
(riscv_generate_version_dispatcher_body): Remove unnecessary remangling.
(riscv_get_function_versions_dispatcher): Remove duplicated cgraph
logic.
* config/rs6000/rs6000.cc (rs6000_mangle_decl_assembler_name): New
function.
(rs6000_get_function_versions_dispatcher): Remove duplicated cgraph
logic.
(make_resolver_func): Refactor to use rs6000_mangle_decl_assembler_name
for mangling.
(is_valid_asm_symbol): Move from multiple_target.cc.
(create_new_asm_name): Ditto.
(rs6000_mangle_function_version_assembler_name): New function.
* multiple_target.cc (create_dispatcher_calls): Remove mangling code.
(get_attr_str): Removed.
(separate_attrs): Ditto.
(is_valid_asm_symbol): Moved to target specific.
(create_new_asm_name): Ditto.
(expand_target_clones): Refactor to use
targetm.mangle_decl_assembler_name for mangling and be more general.
* tree.cc (get_target_clone_attr_len): Removed.
* tree.h (get_target_clone_attr_len): Removed.

gcc/cp/ChangeLog:

* decl.cc (maybe_mark_function_versioned): Change to insert function
version and therefore record assembler name.

gcc/testsuite/ChangeLog:

* g++.target/i386/mv-symbols1.C: Update x86 FMV mangling.
* g++.target/i386/mv-symbols3.C: Ditto.
* g++.target/i386/mv-symbols4.C: Ditto.
* g++.target/i386/mv-symbols5.C: Ditto.
---
 gcc/attribs.cc  |  44 +++--
 gcc/config/aarch64/aarch64.cc   | 162 ++-
 gcc/config/i386/i386-features.cc| 108 ++
 gcc/config/riscv/riscv.cc   | 106 --
 gcc/config/rs6000/rs6000.cc | 115 +--
 gcc/cp/decl.cc  |   7 +
 gcc/multiple_target.cc  | 209 ++--
 gcc/testsuite/g++.target/i386/mv-symbols1.C |  12 +-
 gcc/testsuite/g++.target/i386/mv-symbols3.C |  10 +-
 gcc/testsuite/g++.target/i386/mv-symbols4.C |  10 +-
 gcc/testsuite/g++.target/i386/mv-symbols5.C |  10 +-
 gcc/tree.cc |  26 ---
 gcc/tree.h  |   1 -
 13 files changed, 395 insertions(+), 425 deletions(-)

diff --git a/gcc/attribs.cc b/gcc/attribs.cc
index b00d9529a8d..d0f37d77098 100644
---

[PATCH v2 11/16] Add clone_identifier function.

2025-02-17 Thread Alfie Richards

This is similar to clone_function_name and its siblings but takes an
identifier tree node rather than a function declaration.

This is to be used in conjunction with the identifier node stored in
cgraph_function_version_info::assembler_name to mangle FMV functions in
later patches.
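
As a minimal sketch of the naming scheme, the string-based analogue
below concatenates prefix, identifier, separator and suffix in the same
order as the ACONCAT call in the patch.  clone_name and its defaulted
parameters are hypothetical; the '.' separator and empty prefix are
assumed defaults, whereas the real code takes the separator from
symbol_table::symbol_suffix_separator () and the prefix from target
macros.

```cpp
#include <cassert>
#include <string>

// Hypothetical std::string analogue of clone_identifier: build
// PREFIX + ID + SEPARATOR + SUFFIX.  The defaults are assumptions
// for illustration, not values queried from a target.
std::string
clone_name (const std::string &id, const std::string &suffix,
            char separator = '.', const std::string &prefix = "")
{
  assert (!id.empty ());
  return prefix + id + separator + suffix;
}
```

With these assumptions, clone_name ("foo", "default") yields
"foo.default"; a target without '.' in labels would pass a different
separator.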

gcc/ChangeLog:

* cgraph.h (clone_identifier): New function.
* cgraphclones.cc (clone_identifier): New function.
(clone_function_name): Refactor to use clone_identifier.
---
 gcc/cgraph.h|  1 +
 gcc/cgraphclones.cc | 16 ++--
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index 95ff8673827..f19b629e8eb 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -2629,6 +2629,7 @@ tree clone_function_name (const char *name, const char *suffix,
 tree clone_function_name (tree decl, const char *suffix,
 			  unsigned long number);
 tree clone_function_name (tree decl, const char *suffix);
+tree clone_identifier (tree decl, const char *suffix);
 
 void tree_function_versioning (tree, tree, vec *,
 			   ipa_param_adjustments *,
diff --git a/gcc/cgraphclones.cc b/gcc/cgraphclones.cc
index 5332a433317..6b650849a63 100644
--- a/gcc/cgraphclones.cc
+++ b/gcc/cgraphclones.cc
@@ -557,6 +557,14 @@ clone_function_name (tree decl, const char *suffix)
   /* For consistency this needs to behave the same way as
  ASM_FORMAT_PRIVATE_NAME does, but without the final number
  suffix.  */
+  return clone_identifier (identifier, suffix);
+}
+
+/* Return a new clone of ID ending with the string SUFFIX.  */
+
+tree
+clone_identifier (tree id, const char *suffix)
+{
   char *separator = XALLOCAVEC (char, 2);
   separator[0] = symbol_table::symbol_suffix_separator ();
   separator[1] = 0;
@@ -565,15 +573,11 @@ clone_function_name (tree decl, const char *suffix)
 #else
   const char *prefix = "";
 #endif
-  char *result = ACONCAT ((prefix,
-			   IDENTIFIER_POINTER (identifier),
-			   separator,
-			   suffix,
-			   (char*)0));
+  char *result = ACONCAT (
+(prefix, IDENTIFIER_POINTER (id), separator, suffix, (char *) 0));
   return get_identifier (result);
 }
 
-
 /* Create callgraph node clone with new declaration.  The actual body will be
copied later at compilation stage.  The name of the new clone will be
constructed from the name of the original node, SUFFIX and NUM_SUFFIX.


[PATCH v2 13/16] Change target_version semantics to follow ACLE specification.

2025-02-17 Thread Alfie Richards

This changes the behavior of the target_clones and target_version attributes
to be in line with what is specified in the Arm C Language Extensions (ACLE).

Notably this changes the scope and signature of multiversioned functions
to that of the default version, and changes the resolver to be
created at the implementation of the default version.

This is achieved by changing the C++ front end to no longer resolve any
non-default version decls in lookup, and by moving dispatching
for default_target sets to reuse the dispatching logic for target_clones
in multiple_target.cc.

The dispatching in create_dispatcher_calls is changed for the case of
a lone annotated default function to change the dispatched symbol to
be an alias for the mangled default function.

gcc/ChangeLog:

* cgraphunit.cc (analyze_functions): Add logic for target version
dependencies.
* ipa.cc (symbol_table::remove_unreachable_nodes): Ditto.
* multiple_target.cc (create_dispatcher_calls): Change to support
target version semantics.
(ipa_target_clone): Change to dispatch all function sets in
target_version semantics.

gcc/cp/ChangeLog:

* call.cc (add_candidates): Change to not resolve non-default versions
in target_version semantics.
* class.cc (resolve_address_of_overloaded_function): Ditto.
* cp-gimplify.cc (cp_genericize_r): Change logic to not apply for
target_version semantics.
* decl.cc (start_decl): Change to mark and therefore mangle all
target_version decls.
(start_preparsed_function): Ditto.
* typeck.cc (cp_build_function_call_vec): Add error for calling
unresolvable non-default node in target_version semantics.

gcc/testsuite/ChangeLog:

* g++.target/aarch64/mv-1.C: Change for target_version semantics.
* g++.target/aarch64/mv-symbols2.C: Ditto.
* g++.target/aarch64/mv-symbols3.C: Ditto.
* g++.target/aarch64/mv-symbols4.C: Ditto.
* g++.target/aarch64/mv-symbols5.C: Ditto.
* g++.target/aarch64/mvc-symbols3.C: Ditto.
* g++.target/riscv/mv-symbols2.C: Ditto.
* g++.target/riscv/mv-symbols3.C: Ditto.
* g++.target/riscv/mv-symbols4.C: Ditto.
* g++.target/riscv/mv-symbols5.C: Ditto.
* g++.target/riscv/mvc-symbols3.C: Ditto.
* g++.target/aarch64/mv-symbols10.C: New test.
* g++.target/aarch64/mv-symbols11.C: New test.
* g++.target/aarch64/mv-symbols12.C: New test.
* g++.target/aarch64/mv-symbols13.C: New test.
* g++.target/aarch64/mv-symbols6.C: New test.
* g++.target/aarch64/mv-symbols7.C: New test.
* g++.target/aarch64/mv-symbols8.C: New test.
* g++.target/aarch64/mv-symbols9.C: New test.
---
 gcc/cgraphunit.cc |  9 +++
 gcc/cp/call.cc| 10 +++
 gcc/cp/class.cc   | 13 +++-
 gcc/cp/cp-gimplify.cc | 11 ++-
 gcc/cp/decl.cc| 14 
 gcc/cp/typeck.cc  | 10 +++
 gcc/ipa.cc| 11 +++
 gcc/multiple_target.cc| 73 ---
 gcc/testsuite/g++.target/aarch64/mv-1.C   |  4 +
 .../g++.target/aarch64/mv-symbols10.C | 27 +++
 .../g++.target/aarch64/mv-symbols11.C | 30 
 .../g++.target/aarch64/mv-symbols12.C | 28 +++
 .../g++.target/aarch64/mv-symbols13.C | 28 +++
 .../g++.target/aarch64/mv-symbols2.C  | 12 +--
 .../g++.target/aarch64/mv-symbols3.C  |  6 +-
 .../g++.target/aarch64/mv-symbols4.C  |  6 +-
 .../g++.target/aarch64/mv-symbols5.C  |  6 +-
 .../g++.target/aarch64/mv-symbols6.C  | 25 +++
 .../g++.target/aarch64/mv-symbols7.C  | 48 
 .../g++.target/aarch64/mv-symbols8.C  | 46 
 .../g++.target/aarch64/mv-symbols9.C  | 43 +++
 .../g++.target/aarch64/mvc-symbols3.C | 12 +--
 gcc/testsuite/g++.target/riscv/mv-symbols2.C  | 12 +--
 gcc/testsuite/g++.target/riscv/mv-symbols3.C  |  6 +-
 gcc/testsuite/g++.target/riscv/mv-symbols4.C  |  6 +-
 gcc/testsuite/g++.target/riscv/mv-symbols5.C  |  6 +-
 gcc/testsuite/g++.target/riscv/mvc-symbols3.C | 12 +--
 27 files changed, 456 insertions(+), 58 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-symbols10.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-symbols11.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-symbols12.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-symbols13.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-symbols6.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-symbols7.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-symbols8.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-symbols9.C

diff --git a/gcc/cgraphunit.cc b/gcc/cgraphunit.cc
index 8

Re: [PATCH] pair-fusion: Check for invalid use arrays [PR118320]

2025-02-17 Thread Alex Coplan
On 29/01/2025 18:46, Richard Sandiford wrote:
> As Andrew says in the bugzilla comments, this PR is about a case where
> we tried to fuse two stores of x0, one in which x0 was defined and one
> in which it was undefined.  merge_access_arrays failed on the conflict,
> but the failure wasn't caught.
> 
> Normally the hazard detection code would fail if the instructions
> had incompatible uses.  However, an undefined use doesn't impose
> many restrictions on movements.  I think this is likely to be the
> only case where hazard detection isn't enough.
> 
> As Andrew notes in bugzilla, it might be possible to allow uses
> of defined and undefined values to be merged to the defined value.
> But that sounds dangerous in the general case, as an rtl-ssa-level
> decision.  We might run the risk of turning conditional UB into
> unconditional UB.  And LLVM proves that the definition of "undef"
> isn't simple.
> 
> Tested on aarch64-linux-gnu.  OK to install?

Thanks for taking care of this.  LGTM, but I have a question below, just
for my own understanding ...

> 
> Richard
> 
> 
> gcc/
>   PR rtl-optimization/118320
>   * pair-fusion.cc (pair_fusion_bb_info::fuse_pair): Commonize
>   the merge of input_uses and return early if it fails.
> 
> gcc/testsuite/
>   PR rtl-optimization/118320
>   * g++.dg/torture/pr118320.C: New test.
> ---
>  gcc/pair-fusion.cc  | 32 -
>  gcc/testsuite/g++.dg/torture/pr118320.C | 15 
>  2 files changed, 36 insertions(+), 11 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/torture/pr118320.C
> 
> diff --git a/gcc/pair-fusion.cc b/gcc/pair-fusion.cc
> index 602e572ab6c..5708d0f3b67 100644
> --- a/gcc/pair-fusion.cc
> +++ b/gcc/pair-fusion.cc
> @@ -1730,6 +1730,24 @@ pair_fusion_bb_info::fuse_pair (bool load_p,
>input_uses[i] = remove_note_accesses (attempt, input_uses[i]);
>  }
>  
> +  // Get the uses that the pair instruction would have, and punt if
> +  // the unpaired instructions use different definitions of the same
> +  // register.  That would normally be caught as a side-effect of
> +  // hazard detection below, but this check also deals with cases
> +  // in which one use is undefined and the other isn't.
> +  auto new_uses = merge_access_arrays (attempt,
> +drop_memory_access (input_uses[0]),
> +drop_memory_access (input_uses[1]));
> +  if (!new_uses.is_valid ())
> +{
> +  if (dump_file)
> + fprintf (dump_file,
> +  "  load pair: i%d and i%d use different definiitions of"

... how do we know that this is a load pair here?  Could this not in
theory trigger for stores too?

Thanks,
Alex

> +  " the same register\n",
> +  insns[0]->uid (), insns[1]->uid ());
> +  return false;
> +}
> +
>// Edge case: if the first insn is a writeback load and the
>// second insn is a non-writeback load which transfers into the base
>// register, then we should drop the writeback altogether as the
> @@ -1852,11 +1870,7 @@ pair_fusion_bb_info::fuse_pair (bool load_p,
>  input_defs[1]);
>gcc_assert (pair_change->new_defs.is_valid ());
>  
> -  pair_change->new_uses
> - = merge_access_arrays (attempt,
> -drop_memory_access (input_uses[0]),
> -drop_memory_access (input_uses[1]));
> -  gcc_assert (pair_change->new_uses.is_valid ());
> +  pair_change->new_uses = new_uses;
>set_pair_pat (pair_change);
>  }
>else
> @@ -1877,9 +1891,7 @@ pair_fusion_bb_info::fuse_pair (bool load_p,
>   case Action::CHANGE:
>   {
> set_pair_pat (change);
> -   change->new_uses = merge_access_arrays (attempt,
> -   input_uses[0],
> -   input_uses[1]);
> +   change->new_uses = new_uses;
> auto d1 = drop_memory_access (input_defs[0]);
> auto d2 = drop_memory_access (input_defs[1]);
> change->new_defs = merge_access_arrays (attempt, d1, d2);
> @@ -1907,9 +1919,7 @@ pair_fusion_bb_info::fuse_pair (bool load_p,
> auto new_insn = crtl->ssa->create_insn (attempt, INSN, pair_pat);
> change = make_change (new_insn);
> change->move_range = move_range;
> -   change->new_uses = merge_access_arrays (attempt,
> -   input_uses[0],
> -   input_uses[1]);
> +   change->new_uses = new_uses;
> gcc_assert (change->new_uses.is_valid ());
>  
> auto d1 = drop_memory_access (input_defs[0]);
> diff --git a/gcc/testsuite/g++.dg/torture/pr118320.C 
> b/gcc/testsuite/g++.dg/torture/pr118320.C
> new file mode 100644
> index 000..228d7986

[PATCH v2 16/16] Remove FMV beta warning.

2025-02-17 Thread Alfie Richards

This patch removes the warning for target_version and target_clones
in aarch64 as it is now spec compliant.

gcc/ChangeLog:

* config/aarch64/aarch64.cc (aarch64_process_target_version_attr):
Remove warning.
* config/aarch64/aarch64.opt: Mark -Wno-experimental-fmv-target
deprecated.
* doc/invoke.texi: Ditto.

gcc/testsuite/ChangeLog:

* g++.target/aarch64/mv-1.C: Remove option.
* g++.target/aarch64/mv-and-mvc-error1.C: Ditto.
* g++.target/aarch64/mv-and-mvc-error2.C: Ditto.
* g++.target/aarch64/mv-and-mvc-error3.C: Ditto.
* g++.target/aarch64/mv-and-mvc1.C: Ditto.
* g++.target/aarch64/mv-and-mvc2.C: Ditto.
* g++.target/aarch64/mv-and-mvc3.C: Ditto.
* g++.target/aarch64/mv-and-mvc4.C: Ditto.
* g++.target/aarch64/mv-error1.C: Ditto.
* g++.target/aarch64/mv-error2.C: Ditto.
* g++.target/aarch64/mv-error3.C: Ditto.
* g++.target/aarch64/mv-error4.C: Ditto.
* g++.target/aarch64/mv-error5.C: Ditto.
* g++.target/aarch64/mv-error6.C: Ditto.
* g++.target/aarch64/mv-error7.C: Ditto.
* g++.target/aarch64/mv-error8.C: Ditto.
* g++.target/aarch64/mv-pragma.C: Ditto.
* g++.target/aarch64/mv-symbols1.C: Ditto.
* g++.target/aarch64/mv-symbols10.C: Ditto.
* g++.target/aarch64/mv-symbols11.C: Ditto.
* g++.target/aarch64/mv-symbols12.C: Ditto.
* g++.target/aarch64/mv-symbols13.C: Ditto.
* g++.target/aarch64/mv-symbols2.C: Ditto.
* g++.target/aarch64/mv-symbols3.C: Ditto.
* g++.target/aarch64/mv-symbols4.C: Ditto.
* g++.target/aarch64/mv-symbols5.C: Ditto.
* g++.target/aarch64/mv-symbols6.C: Ditto.
* g++.target/aarch64/mv-symbols7.C: Ditto.
* g++.target/aarch64/mv-symbols8.C: Ditto.
* g++.target/aarch64/mv-symbols9.C: Ditto.
* g++.target/aarch64/mvc-error1.C: Ditto.
* g++.target/aarch64/mvc-error2.C: Ditto.
* g++.target/aarch64/mvc-symbols1.C: Ditto.
* g++.target/aarch64/mvc-symbols2.C: Ditto.
* g++.target/aarch64/mvc-symbols3.C: Ditto.
* g++.target/aarch64/mvc-symbols4.C: Ditto.
* g++.target/aarch64/mv-warning1.C: Removed.
* g++.target/aarch64/mvc-warning1.C: Removed.
---
 gcc/config/aarch64/aarch64.cc| 9 -
 gcc/config/aarch64/aarch64.opt   | 2 +-
 gcc/doc/invoke.texi  | 5 +
 gcc/testsuite/g++.target/aarch64/mv-1.C  | 1 -
 gcc/testsuite/g++.target/aarch64/mv-and-mvc-error1.C | 1 -
 gcc/testsuite/g++.target/aarch64/mv-and-mvc-error2.C | 1 -
 gcc/testsuite/g++.target/aarch64/mv-and-mvc-error3.C | 1 -
 gcc/testsuite/g++.target/aarch64/mv-and-mvc1.C   | 1 -
 gcc/testsuite/g++.target/aarch64/mv-and-mvc2.C   | 1 -
 gcc/testsuite/g++.target/aarch64/mv-and-mvc3.C   | 1 -
 gcc/testsuite/g++.target/aarch64/mv-and-mvc4.C   | 1 -
 gcc/testsuite/g++.target/aarch64/mv-error1.C | 1 -
 gcc/testsuite/g++.target/aarch64/mv-error2.C | 1 -
 gcc/testsuite/g++.target/aarch64/mv-error3.C | 1 -
 gcc/testsuite/g++.target/aarch64/mv-error4.C | 1 -
 gcc/testsuite/g++.target/aarch64/mv-error5.C | 1 -
 gcc/testsuite/g++.target/aarch64/mv-error6.C | 1 -
 gcc/testsuite/g++.target/aarch64/mv-error7.C | 1 -
 gcc/testsuite/g++.target/aarch64/mv-error8.C | 1 -
 gcc/testsuite/g++.target/aarch64/mv-pragma.C | 1 -
 gcc/testsuite/g++.target/aarch64/mv-symbols1.C   | 1 -
 gcc/testsuite/g++.target/aarch64/mv-symbols10.C  | 1 -
 gcc/testsuite/g++.target/aarch64/mv-symbols11.C  | 1 -
 gcc/testsuite/g++.target/aarch64/mv-symbols12.C  | 1 -
 gcc/testsuite/g++.target/aarch64/mv-symbols13.C  | 1 -
 gcc/testsuite/g++.target/aarch64/mv-symbols2.C   | 1 -
 gcc/testsuite/g++.target/aarch64/mv-symbols3.C   | 1 -
 gcc/testsuite/g++.target/aarch64/mv-symbols4.C   | 1 -
 gcc/testsuite/g++.target/aarch64/mv-symbols5.C   | 1 -
 gcc/testsuite/g++.target/aarch64/mv-symbols6.C   | 1 -
 gcc/testsuite/g++.target/aarch64/mv-symbols7.C   | 1 -
 gcc/testsuite/g++.target/aarch64/mv-symbols8.C   | 1 -
 gcc/testsuite/g++.target/aarch64/mv-symbols9.C   | 1 -
 gcc/testsuite/g++.target/aarch64/mv-warning1.C   | 9 -
 gcc/testsuite/g++.target/aarch64/mvc-error1.C| 1 -
 gcc/testsuite/g++.target/aarch64/mvc-error2.C| 1 -
 gcc/testsuite/g++.target/aarch64/mvc-symbols1.C  | 1 -
 gcc/testsuite/g++.target/aarch64/mvc-symbols2.C  | 1 -
 gcc/testsuite/g++.target/aarch64/mvc-symbols3.C  | 1 -
 gcc/testsuite/g++.target/aarch64/mvc-symbols4.C  | 1 -
 gcc/testsuite/g++.target/aarch64/mvc-warning1.C  | 6 --
 41 files changed, 2 insertions(+), 65 deletions(-)
 delete mode 100644 gcc/testsuite/g++.target/aarch64/mv-warning1.C
 delete mode 100644 gcc/testsuite/g++.target/aarch64/mvc-warni

Re: [PATCH] pair-fusion: Check for invalid use arrays [PR118320]

2025-02-17 Thread Richard Sandiford
Alex Coplan  writes:
> On 29/01/2025 18:46, Richard Sandiford wrote:
>> As Andrew says in the bugzilla comments, this PR is about a case where
>> we tried to fuse two stores of x0, one in which x0 was defined and one
>> in which it was undefined.  merge_access_arrays failed on the conflict,
>> but the failure wasn't caught.
>> 
>> Normally the hazard detection code would fail if the instructions
>> had incompatible uses.  However, an undefined use doesn't impose
>> many restrictions on movements.  I think this is likely to be the
>> only case where hazard detection isn't enough.
>> 
>> As Andrew notes in bugzilla, it might be possible to allow uses
>> of defined and undefined values to be merged to the defined value.
>> But that sounds dangerous in the general case, as an rtl-ssa-level
>> decision.  We might run the risk of turning conditional UB into
>> unconditional UB.  And LLVM proves that the definition of "undef"
>> isn't simple.
>> 
>> Tested on aarch64-linux-gnu.  OK to install?
>
> Thanks for taking care of this.  LGTM, but I have a question below, just
> for my own understanding ...
>
>> 
>> Richard
>> 
>> 
>> gcc/
>>  PR rtl-optimization/118320
>>  * pair-fusion.cc (pair_fusion_bb_info::fuse_pair): Commonize
>>  the merge of input_uses and return early if it fails.
>> 
>> gcc/testsuite/
>>  PR rtl-optimization/118320
>>  * g++.dg/torture/pr118320.C: New test.
>> ---
>>  gcc/pair-fusion.cc  | 32 -
>>  gcc/testsuite/g++.dg/torture/pr118320.C | 15 
>>  2 files changed, 36 insertions(+), 11 deletions(-)
>>  create mode 100644 gcc/testsuite/g++.dg/torture/pr118320.C
>> 
>> diff --git a/gcc/pair-fusion.cc b/gcc/pair-fusion.cc
>> index 602e572ab6c..5708d0f3b67 100644
>> --- a/gcc/pair-fusion.cc
>> +++ b/gcc/pair-fusion.cc
>> @@ -1730,6 +1730,24 @@ pair_fusion_bb_info::fuse_pair (bool load_p,
>>input_uses[i] = remove_note_accesses (attempt, input_uses[i]);
>>  }
>>  
>> +  // Get the uses that the pair instruction would have, and punt if
>> +  // the unpaired instructions use different definitions of the same
>> +  // register.  That would normally be caught as a side-effect of
>> +  // hazard detection below, but this check also deals with cases
>> +  // in which one use is undefined and the other isn't.
>> +  auto new_uses = merge_access_arrays (attempt,
>> +   drop_memory_access (input_uses[0]),
>> +   drop_memory_access (input_uses[1]));
>> +  if (!new_uses.is_valid ())
>> +{
>> +  if (dump_file)
>> +fprintf (dump_file,
>> + "  load pair: i%d and i%d use different definiitions of"
>
> ... how do we know that this is a load pair here?  Could this not in
> theory trigger for stores too?


You're right, and as mentioned above, stores were the motivating case.
I'd just copied this style mechanically from a neighbouring dump message,
on the assumption that "load pair" was being used as a generic term,
but I see now that it isn't.

So yeah, please feel free to correct to whatever you think is better.
(The patch is already committed, since there was a request to get this
fixed while you were away.)
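
One possible shape for the correction, purely as a sketch: pick the
label from the load_p argument of fuse_pair.  use_conflict_message is a
hypothetical helper that returns the text rather than writing it to
dump_file, and it is not what was committed.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: build the dump text with a label that matches
// the kind of pair being fused.  load_p mirrors the fuse_pair
// parameter; uid1 and uid2 stand in for insns[0]->uid () and
// insns[1]->uid ().
std::string
use_conflict_message (bool load_p, int uid1, int uid2)
{
  assert (uid1 != uid2);
  const char *kind = load_p ? "load" : "store";
  return std::string ("  ") + kind + " pair: i" + std::to_string (uid1)
         + " and i" + std::to_string (uid2)
         + " use different definitions of the same register\n";
}
```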

Thanks,
Richard


Re: [PATCH v2] ira: Add a target hook for callee-saved register cost scale

2025-02-17 Thread Richard Sandiford
Jan Hubicka  writes:
>> As described below, the patch also shows no change to AArch64 SPEC2017
>> scores.  I'm afraid I'll need help from x86 folks to do performance
>> testing there.
>
> I will look into this over weekend. I can write x86 version of the
> hooks. Though in earlier email you mentioned you hacked up something, so
> if you can share it with me, perhaps I can start from there.

Sorry, was away until today.  But the earlier hacked up version was
essentially what I included in this patch.

Thanks,
Richard


Re: [PATCH] COBOL 3/15 92K bld: config and build machinery

2025-02-17 Thread Richard Biener
On Sat, Feb 15, 2025 at 10:20 PM Sam James  wrote:
>
> "James K. Lowden"  writes:
>
> > From 5d53920602e234e4d99ae2d502e662ee3699978e 4 Oct 2024 12:01:22 -0400
> > From: "James K. Lowden" 
> > Date: Sat 15 Feb 2025 12:50:52 PM EST
> > Subject: [PATCH] Add 'cobol' to 17 files
>
> The commit message summary (first line) should say something like the
> email title, so 'cobol: bld: config and build machinery'.
> >
> > ChangeLog
> >   * Makefile.def: Add libgcobol module and cobol language.
> >   * Makefile.in: Add libgcobol module and cobol language.
> >   * configure: Regenerate.
> >   * configure.ac: Add libgcobol module and cobol language.
> >
> > gcc/ChangeLog
> >   * common.opt: New file.
> >   * dwarf2out.cc: Add cobol language.
> >
> > gcc/cobol/ChangeLog
> >   * LICENSE: New file.
> >   * Make-lang.in: New file.
> >   * config-lang.in: New file.
> >   * lang.opt: New file.
> >   * lang.opt.urls: New file.
> >
> > libgcobol/ChangeLog
> >   * Makefile.in: New file.
> >   * acinclude.m4: New file.
> >   * aclocal.m4: New file.
> >   * configure.ac: New file.
> >   * configure.tgt: New file.
> >
> > maintainer-scripts/ChangeLog
> >   * update_web_docs_git: Add libgcobol module and cobol language.
> >
> > ---
> > Makefile.def | ++-
> > Makefile.in | 
> > -
> > configure | 
> > --
> > configure.ac | -
> > gcc/cobol/LICENSE | +-
> > gcc/cobol/Make-lang.in | 
> > +-
> > gcc/cobol/config-lang.in | ++-
> > gcc/cobol/lang.opt | 
> > -
> > gcc/cobol/lang.opt.urls | +-
> > gcc/common.opt | -
> > gcc/dwarf2out.cc | +-
> > libgcobol/Makefile.in | 
> > -
> > libgcobol/acinclude.m4 | ++-
> > libgcobol/aclocal.m4 | 
> > +-
> > libgcobol/configure.ac | 
> > +-
> > libgcobol/configure.tgt | 
> > +++-
> > maintainer-scripts/update_web_docs_git | +
> > 17 files changed, 2244 insertions(+), 22 deletions(-)
> > diff --git a/Makefile.def b/Makefile.def
> > index 19954e7d731..1192e852c7a 100644
> > --- a/Makefile.def
> > +++ b/Makefile.def
> > @@ -209,6 +209,7 @@ target_modules = { module= libgomp; bootstrap= true; 
> > lib_path=.libs; };
> >  target_modules = { module= libitm; lib_path=.libs; };
> >  target_modules = { module= libatomic; bootstrap=true; lib_path=.libs; };
> >  target_modules = { module= libgrust; };
> > +target_modules = { module= libgcobol; };
> >
> >  // These are (some of) the make targets to be done in each subdirectory.
> >  // Not all; these are the ones which don't have special options.
> > @@ -324,6 +325,7 @@ flags_to_pass = { flag= CXXFLAGS_FOR_TARGET ; };
> >  flags_to_pass = { flag= DLLTOOL_FOR_TARGET ; };
> >  flags_to_pass = { flag= DSYMUTIL_FOR_TARGET ; };
> >  flags_to_pass = { flag= FLAGS_FOR_TARGET ; };
> > +flags_to_pass = { flag= GCOBOL

[PATCH v2 02/16] Add x86 FMV symbol tests

2025-02-17 Thread Alfie Richards

This is for testing the x86 mangling of FMV versioned function
assembly names.

gcc/testsuite/ChangeLog:

* g++.target/i386/mv-symbols1.C: New test.
* g++.target/i386/mv-symbols2.C: New test.
* g++.target/i386/mv-symbols3.C: New test.
* g++.target/i386/mv-symbols4.C: New test.
* g++.target/i386/mv-symbols5.C: New test.
* g++.target/i386/mvc-symbols1.C: New test.
* g++.target/i386/mvc-symbols2.C: New test.
* g++.target/i386/mvc-symbols3.C: New test.
* g++.target/i386/mvc-symbols4.C: New test.

Co-authored-by: Alfie Richards 
---
 gcc/testsuite/g++.target/i386/mv-symbols1.C  | 68 
 gcc/testsuite/g++.target/i386/mv-symbols2.C  | 56 
 gcc/testsuite/g++.target/i386/mv-symbols3.C  | 44 +
 gcc/testsuite/g++.target/i386/mv-symbols4.C  | 50 ++
 gcc/testsuite/g++.target/i386/mv-symbols5.C  | 56 
 gcc/testsuite/g++.target/i386/mvc-symbols1.C | 44 +
 gcc/testsuite/g++.target/i386/mvc-symbols2.C | 29 +
 gcc/testsuite/g++.target/i386/mvc-symbols3.C | 35 ++
 gcc/testsuite/g++.target/i386/mvc-symbols4.C | 23 +++
 9 files changed, 405 insertions(+)
 create mode 100644 gcc/testsuite/g++.target/i386/mv-symbols1.C
 create mode 100644 gcc/testsuite/g++.target/i386/mv-symbols2.C
 create mode 100644 gcc/testsuite/g++.target/i386/mv-symbols3.C
 create mode 100644 gcc/testsuite/g++.target/i386/mv-symbols4.C
 create mode 100644 gcc/testsuite/g++.target/i386/mv-symbols5.C
 create mode 100644 gcc/testsuite/g++.target/i386/mvc-symbols1.C
 create mode 100644 gcc/testsuite/g++.target/i386/mvc-symbols2.C
 create mode 100644 gcc/testsuite/g++.target/i386/mvc-symbols3.C
 create mode 100644 gcc/testsuite/g++.target/i386/mvc-symbols4.C

diff --git a/gcc/testsuite/g++.target/i386/mv-symbols1.C b/gcc/testsuite/g++.target/i386/mv-symbols1.C
new file mode 100644
index 000..1290299aea5
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/mv-symbols1.C
@@ -0,0 +1,68 @@
+/* { dg-do compile } */
+/* { dg-require-ifunc "" } */
+/* { dg-options "-O0" } */
+
+__attribute__((target("default")))
+int foo ()
+{
+  return 1;
+}
+
+__attribute__((target("arch=slm")))
+int foo ()
+{
+  return 3;
+}
+
+__attribute__((target("sse4.2")))
+int foo ()
+{
+  return 5;
+}
+
+__attribute__((target("sse4.2")))
+int foo (int)
+{
+  return 6;
+}
+
+__attribute__((target("arch=slm")))
+int foo (int)
+{
+  return 4;
+}
+
+__attribute__((target("default")))
+int foo (int)
+{
+  return 2;
+}
+
+int bar()
+{
+  return foo ();
+}
+
+int bar(int x)
+{
+  return foo (x);
+}
+
+/* When updating any of the symbol names in these tests, make sure to also
+   update any tests for their absence in mvc-symbolsN.C */
+
+/* { dg-final { scan-assembler-times "\n_Z3foov:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3foov\.arch_slm:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3foov\.sse4.2:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3foov\.resolver:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\tcall\t_Z7_Z3foovv\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.type\t_Z7_Z3foovv, @gnu_indirect_function\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.set\t_Z7_Z3foovv,_Z3foov\.resolver\n" 1 } } */
+
+/* { dg-final { scan-assembler-times "\n_Z3fooi:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3fooi\.arch_slm:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3fooi\.sse4.2:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3fooi\.resolver:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\tcall\t_Z7_Z3fooii\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.type\t_Z7_Z3fooii, @gnu_indirect_function\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.set\t_Z7_Z3fooii,_Z3fooi\.resolver\n" 1 } } */
diff --git a/gcc/testsuite/g++.target/i386/mv-symbols2.C b/gcc/testsuite/g++.target/i386/mv-symbols2.C
new file mode 100644
index 000..8b75565d78d
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/mv-symbols2.C
@@ -0,0 +1,56 @@
+/* { dg-do compile } */
+/* { dg-require-ifunc "" } */
+/* { dg-options "-O0" } */
+
+__attribute__((target("default")))
+int foo ()
+{
+  return 1;
+}
+
+__attribute__((target("arch=slm")))
+int foo ()
+{
+  return 3;
+}
+
+__attribute__((target("sse4.2")))
+int foo ()
+{
+  return 5;
+}
+
+__attribute__((target("sse4.2")))
+int foo (int)
+{
+  return 6;
+}
+
+__attribute__((target("arch=slm")))
+int foo (int)
+{
+  return 4;
+}
+
+__attribute__((target("default")))
+int foo (int)
+{
+  return 2;
+}
+
+/* When updating any of the symbol names in these tests, make sure to also
+   update any tests for their absence in mvc-symbolsN.C */
+
+/* { dg-final { scan-assembler-times "\n_Z3foov:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3foov\.arch_slm:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3foov\.sse4.2:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3foov\.resolve

[PATCH v2 10/16] Add dispatcher_resolver_function and is_target_clone to cgraph_node.

2025-02-17 Thread Alfie Richards

These flags are used to make sure mangling is done correctly.

gcc/ChangeLog:

* cgraph.h (struct cgraph_node): Add dispatcher_resolver_function and
is_target_clone.
---
 gcc/cgraph.h | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index d9177364b7a..95ff8673827 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -907,7 +907,9 @@ struct GTY((tag ("SYMTAB_FUNCTION"))) cgraph_node : public symtab_node
   used_as_abstract_origin (false),
   lowered (false), process (false), frequency (NODE_FREQUENCY_NORMAL),
   only_called_at_startup (false), only_called_at_exit (false),
-  tm_clone (false), dispatcher_function (false), calls_comdat_local (false),
+  tm_clone (false), dispatcher_function (false),
+  dispatcher_resolver_function (false), is_target_clone (false),
+  calls_comdat_local (false),
   icf_merged (false), nonfreeing_fn (false), merged_comdat (false),
   merged_extern_inline (false), parallelized_function (false),
   split_part (false), indirect_call_target (false), local (false),
@@ -1465,6 +1467,11 @@ struct GTY((tag ("SYMTAB_FUNCTION"))) cgraph_node : public symtab_node
   unsigned tm_clone : 1;
   /* True if this decl is a dispatcher for function versions.  */
   unsigned dispatcher_function : 1;
+  /* True if this decl is a resolver for function versions.  */
+  unsigned dispatcher_resolver_function : 1;
+  /* True if this is part of a multiversioned set and the default version
+ comes from a target_clones attribute.  */
+  unsigned is_target_clone : 1;
   /* True if this decl calls a COMDAT-local function.  This is set up in
  compute_fn_summary and inline_call.  */
   unsigned calls_comdat_local : 1;


[PATCH v2 07/16] Add version of make_attribute supporting string_slice.

2025-02-17 Thread Alfie Richards

gcc/ChangeLog:

* attribs.cc (make_attribute): New function overload.
* attribs.h (make_attribute): New function overload.
---
 gcc/attribs.cc | 14 +-
 gcc/attribs.h  |  1 +
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/gcc/attribs.cc b/gcc/attribs.cc
index f6667839c01..b00d9529a8d 100644
--- a/gcc/attribs.cc
+++ b/gcc/attribs.cc
@@ -1090,7 +1090,19 @@ make_attribute (const char *name, const char *arg_name, tree chain)
   return attr;
 }
 
-
+/* Makes a function attribute of the form NAME (ARG_NAME) and chains
+   it to CHAIN.  */
+
+tree
+make_attribute (string_slice name, string_slice arg_name, tree chain)
+{
+  tree attr_name = get_identifier_with_length (name.begin (), name.size ());
+  tree attr_arg_name = build_string (arg_name.size (), arg_name.begin ());
+  tree attr_args = tree_cons (NULL_TREE, attr_arg_name, NULL_TREE);
+  tree attr = tree_cons (attr_name, attr_args, chain);
+  return attr;
+}
+
 /* Common functions used for target clone support.  */
 
 /* Comparator function to be used in qsort routine to sort attribute
diff --git a/gcc/attribs.h b/gcc/attribs.h
index 4b946390f76..e7d592c5b41 100644
--- a/gcc/attribs.h
+++ b/gcc/attribs.h
@@ -46,6 +46,7 @@ extern tree get_attribute_name (const_tree);
 extern tree get_attribute_namespace (const_tree);
 extern void apply_tm_attr (tree, tree);
 extern tree make_attribute (const char *, const char *, tree);
+extern tree make_attribute (string_slice, string_slice, tree);
 extern bool attribute_ignored_p (tree);
 extern bool attribute_ignored_p (const attribute_spec *const);
 extern bool any_nonignored_attribute_p (tree);


[PATCH v2 03/16] Add string_slice class.

2025-02-17 Thread Alfie Richards

The string_slice inherits from array_slice and is used to refer to a
substring of an array that is memory managed elsewhere without modifying
the underlying array.

For example, this is useful in cases such as when needing to refer to a
substring of an attribute in the syntax tree.

This commit also adds some minimal helper functions for string_slice,
such as a strtok alternative, equality operators, strcmp, and a function
to strip whitespace from the beginning and end of a string_slice.
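
As a rough illustration of the behaviour described above, here is a minimal
sketch written against std::string_view rather than the string_slice class
from this patch.  The free functions tokenize and strip below, and the use of
std::optional to stand in for an "invalid" slice, are assumptions made for the
example and are not the patch's API.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical stand-in for string_slice::tokenize: returns the text before
// the first delimiter and advances *rest past that delimiter.  When no
// delimiter is left, it returns the whole remainder and resets *rest to
// std::nullopt (playing the role of an invalid slice).
inline std::string_view
tokenize (std::optional<std::string_view> *rest, std::string_view delims)
{
  assert (rest && rest->has_value ());
  std::string_view s = **rest;
  std::size_t pos = s.find_first_of (delims);
  if (pos == std::string_view::npos)
    {
      rest->reset ();
      return s;
    }
  *rest = s.substr (pos + 1);
  return s.substr (0, pos);
}

// Hypothetical stand-in for string_slice::strip: drops leading and trailing
// whitespace without modifying the underlying buffer.
inline std::string_view
strip (std::string_view s)
{
  while (!s.empty () && std::isspace (static_cast<unsigned char> (s.front ())))
    s.remove_prefix (1);
  while (!s.empty () && std::isspace (static_cast<unsigned char> (s.back ())))
    s.remove_suffix (1);
  return s;
}
```

Unlike strtok, nothing is written into the source buffer, and consecutive
delimiters produce empty tokens instead of being merged.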

gcc/ChangeLog:

* vec.cc (string_slice::tokenize): New method.
(strcmp): Add implementation for string_slice.
(string_slice::strip): New method.
(test_string_slice_initializers): New test.
(test_string_slice_tokenize): Ditto.
(test_string_slice_strcmp): Ditto.
(test_string_slice_equality): Ditto.
(test_string_slice_invalid): Ditto.
(test_string_slice_strip): Ditto.
(vec_cc_tests): Add new tests.
* vec.h (class string_slice): New class.
(strcmp): Add implementation for string_slice.
---
 gcc/vec.cc | 210 +
 gcc/vec.h  |  65 +
 2 files changed, 275 insertions(+)

diff --git a/gcc/vec.cc b/gcc/vec.cc
index 55f5f3dd447..189cb492c7e 100644
--- a/gcc/vec.cc
+++ b/gcc/vec.cc
@@ -176,6 +176,61 @@ dump_vec_loc_statistics (void)
   vec_mem_desc.dump (VEC_ORIGIN);
 }
 
+string_slice
+string_slice::tokenize (string_slice *str, string_slice delims)
+{
+  const char *ptr = str->begin ();
+
+  gcc_assert (str->is_valid () && delims.is_valid ());
+
+  for (; ptr < str->end (); ptr++)
+for (char c : delims)
+  if (*ptr == c)
+	{
+	  /* Update the input string to be the remaining string.  */
+	  const char* str_begin = str->begin ();
+	  *str = string_slice (ptr  + 1, str->end ());
+	  return string_slice (str_begin, ptr);
+	}
+
+  /* If no delimiters between the start and end, return the whole string.  */
+  string_slice res = *str;
+  *str = string_slice::invalid ();
+  return res;
+}
+
+int
+strcmp (string_slice str1, string_slice str2)
+{
+  for (unsigned int i = 0; i < str1.size () && i < str2.size (); i++)
+{
+  if (str1[i] < str2[i])
+	return -1;
+  if (str1[i] > str2[i])
+	return 1;
+}
+
+  if (str1.size () < str2.size ())
+return -1;
+  if (str1.size () > str2.size ())
+return 1;
+  return 0;
+}
+
+string_slice
+string_slice::strip ()
+{
+  const char *start = this->begin ();
+  const char *end = this->end ();
+
+  while (start < end && ISSPACE (*start))
+start++;
+  while (end > start && ISSPACE (*(end-1)))
+end--;
+
+  return string_slice (start, end-start);
+}
+
 #if CHECKING_P
 /* Report qsort comparator CMP consistency check failure with P1, P2, P3 as
witness elements.  */
@@ -584,6 +639,154 @@ test_auto_alias ()
   ASSERT_EQ (val, 0);
 }
 
+static void
+test_string_slice_initializers ()
+{
+  string_slice str1 = string_slice ();
+  ASSERT_TRUE (str1.is_valid ());
+  ASSERT_EQ (str1.size (), 0);
+
+  string_slice str2 = string_slice ("Test string");
+  ASSERT_TRUE (str2.is_valid ());
+  ASSERT_EQ (str2.size (), 11);
+
+  string_slice str3 = string_slice ("Test string", 4);
+  ASSERT_TRUE (str3.is_valid ());
+  ASSERT_EQ (str3.size (), 4);
+}
+
+static void
+test_string_slice_tokenize ()
+{
+  string_slice test_string_slice = string_slice ("");
+  string_slice test_delims = string_slice (",");
+
+  ASSERT_EQ (string_slice::tokenize (&test_string_slice, test_delims),
+	 string_slice (""));
+  ASSERT_FALSE (test_string_slice.is_valid ());
+
+  test_string_slice = string_slice (",");
+  test_delims = string_slice (",");
+  ASSERT_EQ (string_slice::tokenize (&test_string_slice, test_delims),
+	 string_slice (""));
+  ASSERT_EQ (string_slice::tokenize (&test_string_slice, test_delims),
+	 string_slice (""));
+  ASSERT_FALSE (test_string_slice.is_valid ());
+
+  test_string_slice = string_slice (",test.,.test, ,  test  ");
+  test_delims = string_slice (",.");
+  ASSERT_EQ (string_slice::tokenize (&test_string_slice, test_delims),
+	 string_slice (""));
+  ASSERT_EQ (string_slice::tokenize (&test_string_slice, test_delims),
+	 string_slice ("test"));
+  ASSERT_EQ (string_slice::tokenize (&test_string_slice, test_delims),
+	 string_slice (""));
+  ASSERT_EQ (string_slice::tokenize (&test_string_slice, test_delims),
+	 string_slice (""));
+  ASSERT_EQ (string_slice::tokenize (&test_string_slice, test_delims),
+	 string_slice ("test"));
+  ASSERT_EQ (string_slice::tokenize (&test_string_slice, test_delims),
+	 string_slice (" "));
+  ASSERT_EQ (string_slice::tokenize (&test_string_slice, test_delims),
+	 string_slice ("  test  "));
+  ASSERT_FALSE (test_string_slice.is_valid ());
+
+  const char *test_string
+= "This is the test string, it \0 is for testing, 123 ,,";
+  test_string_slice = string_slice (test_string, 52);
+  test_delims = string_slice (",\0", 2);
+
+  ASSERT_EQ (string_slic

[PATCH v2 08/16] Add get_clone_versions function.

2025-02-17 Thread Alfie Richards

This is a reimplementation of get_target_clone_attr_len,
get_attr_str, and separate_attrs using string_slice and auto_vec to make
memory management and use simpler.
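
To make the intent concrete, the following is a minimal sketch of the same
splitting logic using only the standard library.  The names
split_clone_versions and trim are hypothetical, a std::vector of
std::string_view stands in for the auto_vec of string_slice, and the separator
is assumed to be a comma.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: remove leading and trailing whitespace from a view.
inline std::string_view
trim (std::string_view s)
{
  while (!s.empty () && std::isspace (static_cast<unsigned char> (s.front ())))
    s.remove_prefix (1);
  while (!s.empty () && std::isspace (static_cast<unsigned char> (s.back ())))
    s.remove_suffix (1);
  return s;
}

// Split every attribute argument on SEPARATOR and collect the trimmed version
// strings.  *DEFAULT_COUNT, if given, is incremented once per "default" entry.
// The returned views point into ARGS, so ARGS must outlive the result.
inline std::vector<std::string_view>
split_clone_versions (const std::vector<std::string> &args,
                      int *default_count = nullptr, char separator = ',')
{
  std::vector<std::string_view> versions;
  for (const std::string &arg : args)
    {
      std::string_view rest = arg;
      for (;;)
        {
          std::size_t pos = rest.find (separator);
          std::string_view version = trim (rest.substr (0, pos));
          if (version == "default" && default_count)
            ++*default_count;
          versions.push_back (version);
          if (pos == std::string_view::npos)
            break;
          rest.remove_prefix (pos + 1);
        }
    }
  return versions;
}
```

Because the result holds views rather than copies, no length pre-computation
or manual buffer allocation is needed, which is the simplification the
description above is after.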

gcc/c-family/ChangeLog:

* c-attribs.cc (handle_target_clones_attribute): Change to use
get_clone_versions.

gcc/ChangeLog:

* tree.cc (get_clone_versions): New function.
(get_clone_attr_versions): New function.
* tree.h (get_clone_versions): New function.
(get_clone_attr_versions): New function.
---
 gcc/c-family/c-attribs.cc |  2 +-
 gcc/tree.cc   | 40 +++
 gcc/tree.h|  3 +++
 3 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index f3181e7b57c..642d724f6c6 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -6129,7 +6129,7 @@ handle_target_clones_attribute (tree *node, tree name, tree ARG_UNUSED (args),
 	}
 	}
 
-  if (get_target_clone_attr_len (args) == -1)
+  if (get_clone_attr_versions (args).length () == 1)
 	{
 	  warning (OPT_Wattributes,
 		   "single %<target_clones%> attribute is ignored");
diff --git a/gcc/tree.cc b/gcc/tree.cc
index 0743ed71c78..83dc9f32f96 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -15356,6 +15356,46 @@ get_target_clone_attr_len (tree arglist)
   return str_len_sum;
 }
 
+/* Returns an auto_vec of string_slices containing the version strings from
+   ARGLIST.  DEFAULT_COUNT is incremented for each default version found.  */
+
+auto_vec<string_slice>
+get_clone_attr_versions (const tree arglist, int *default_count)
+{
+  gcc_assert (TREE_CODE (arglist) == TREE_LIST);
+  auto_vec<string_slice> versions;
+
+  static const char separator_str[] = {TARGET_CLONES_ATTR_SEPARATOR, 0};
+  string_slice separators = string_slice (separator_str);
+
+  for (tree arg = arglist; arg; arg = TREE_CHAIN (arg))
+{
+  string_slice str = string_slice (TREE_STRING_POINTER (TREE_VALUE (arg)));
+  while (str.is_valid ())
+	{
+	  string_slice attr = string_slice::tokenize (&str, separators);
+	  attr = attr.strip ();
+	  if (attr == "default" && default_count)
+	(*default_count)++;
+	  versions.safe_push (attr);
+	}
+}
+  return versions;
+}
+
+/* Returns an auto_vec of string_slices containing the version strings from
+   the target_clone attribute from DECL.  DEFAULT_COUNT is incremented for each
+   default version found.  */
+auto_vec<string_slice>
+get_clone_versions (const tree decl, int *default_count)
+{
+  tree attr = lookup_attribute ("target_clones", DECL_ATTRIBUTES (decl));
+  if (!attr)
+return auto_vec<string_slice> ();
+  tree arglist = TREE_VALUE (attr);
+  return get_clone_attr_versions (arglist, default_count);
+}
+
 void
 tree_cc_finalize (void)
 {
diff --git a/gcc/tree.h b/gcc/tree.h
index 21f3cd5525c..70541070c40 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -22,6 +22,7 @@ along with GCC; see the file COPYING3.  If not see
 
 #include "tree-core.h"
 #include "options.h"
+#include "vec.h"
 
 /* Convert a target-independent built-in function code to a combined_fn.  */
 
@@ -7035,5 +7036,7 @@ extern unsigned fndecl_dealloc_argno (tree);
 extern tree get_attr_nonstring_decl (tree, tree * = NULL);
 
 extern int get_target_clone_attr_len (tree);
+auto_vec<string_slice> get_clone_versions (const tree, int * = NULL);
+auto_vec<string_slice> get_clone_attr_versions (const tree, int * = NULL);
 
 #endif  /* GCC_TREE_H  */


[PATCH v2 06/16] Change function versions to be implicitly ordered.

2025-02-17 Thread Alfie Richards

This changes function version structures to maintain the default version
as the first declaration in the linked data structures by giving priority
to the set containing the default when constructing the structure.

This allows for removing logic for moving the default to the first
position which was duplicated across target specific code and enables
easier reasoning about function sets when checking for a default.
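
The ordering invariant can be pictured with a small stand-alone model.  This
is only a sketch: version_node, first, last and link_versions are hypothetical
names, and the swap condition is a simplified form of the one in the cgraph.cc
hunk below (it assumes any default already sits at the head of its chain).

```cpp
#include <utility>

// Hypothetical model of one entry in a doubly linked chain of versions.
struct version_node
{
  bool is_default = false;
  version_node *prev = nullptr;
  version_node *next = nullptr;
};

// Walk to the head of the chain containing N.
inline version_node *
first (version_node *n)
{
  while (n->prev)
    n = n->prev;
  return n;
}

// Walk to the tail of the chain containing N.
inline version_node *
last (version_node *n)
{
  while (n->next)
    n = n->next;
  return n;
}

// Join the chains containing A and B so that a default version, if there is
// one, ends up at the head of the combined chain.
inline void
link_versions (version_node *a, version_node *b)
{
  version_node *before = first (a);
  version_node *after = first (b);
  if (before == after)
    return;  // Already in the same chain.

  // Give priority to the chain whose head is the default version.
  if (after->is_default)
    std::swap (before, after);

  version_node *tail = last (before);
  tail->next = after;
  after->prev = tail;
}
```

With the invariant maintained at link time, a consumer only has to walk to the
head of the chain and check whether that node is the default.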

gcc/ChangeLog:

* cgraph.cc (cgraph_node::record_function_versions): Update to
implicitly keep default first.
* config/aarch64/aarch64.cc (aarch64_get_function_versions_dispatcher):
Remove reordering.
* config/i386/i386-features.cc (ix86_get_function_versions_dispatcher):
Remove reordering.
* config/riscv/riscv.cc (riscv_get_function_versions_dispatcher):
Remove reordering.
* config/rs6000/rs6000.cc (rs6000_get_function_versions_dispatcher):
Remove reordering.
---
 gcc/cgraph.cc| 27 -
 gcc/config/aarch64/aarch64.cc| 37 +++-
 gcc/config/i386/i386-features.cc | 33 -
 gcc/config/riscv/riscv.cc| 41 +++-
 gcc/config/rs6000/rs6000.cc  | 35 +--
 5 files changed, 49 insertions(+), 124 deletions(-)

diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc
index d0b19ad850e..bf6b43d00db 100644
--- a/gcc/cgraph.cc
+++ b/gcc/cgraph.cc
@@ -247,7 +247,9 @@ cgraph_node::record_function_versions (tree decl1, tree decl2)
   decl1_v = decl1_node->function_version ();
   decl2_v = decl2_node->function_version ();
 
-  if (decl1_v != NULL && decl2_v != NULL)
+  /* If the nodes are already linked, skip.  */
+  if ((decl1_v != NULL && (decl1_v->next || decl1_v->prev))
+  && (decl2_v != NULL && (decl2_v->next || decl2_v->prev)))
 return;
 
   if (decl1_v == NULL)
@@ -256,18 +258,31 @@ cgraph_node::record_function_versions (tree decl1, tree decl2)
   if (decl2_v == NULL)
 decl2_v = decl2_node->insert_new_function_version ();
 
-  /* Chain decl2_v and decl1_v.  All semantically identical versions
- will be chained together.  */
+  gcc_assert (decl1_v);
+  gcc_assert (decl2_v);
 
   before = decl1_v;
   after = decl2_v;
 
+  /* Go to first after node.  */
+  while (after->prev != NULL)
+after = after->prev;
+
+  while (before->prev != NULL)
+before = before->prev;
+
+  /* Potentially swap the nodes to maintain the default always being in the
+ first position.  */
+  if (before->next
+  ? !is_function_default_version (before->this_node->decl)
+  : is_function_default_version (after->this_node->decl))
+std::swap (before, after);
+
+  /* Go to last node of before.  */
   while (before->next != NULL)
 before = before->next;
 
-  while (after->prev != NULL)
-after= after->prev;
-
+  /* Chain decl2_v and decl1_v.  */
   before->next = after;
   after->prev = before;
 }
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index f5f23f6ff4b..418f3039329 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -20650,7 +20650,6 @@ aarch64_get_function_versions_dispatcher (void *decl)
   struct cgraph_node *node = NULL;
   struct cgraph_node *default_node = NULL;
   struct cgraph_function_version_info *node_v = NULL;
-  struct cgraph_function_version_info *first_v = NULL;
 
   tree dispatch_decl = NULL;
 
@@ -20667,37 +20666,17 @@ aarch64_get_function_versions_dispatcher (void *decl)
   if (node_v->dispatcher_resolver != NULL)
 return node_v->dispatcher_resolver;
 
-  /* Find the default version and make it the first node.  */
-  first_v = node_v;
-  /* Go to the beginning of the chain.  */
-  while (first_v->prev != NULL)
-first_v = first_v->prev;
-  default_version_info = first_v;
-  while (default_version_info != NULL)
-{
-  if (get_feature_mask_for_version
-	(default_version_info->this_node->decl) == 0ULL)
-	break;
-  default_version_info = default_version_info->next;
-}
-
-  /* If there is no default node, just return NULL.  */
-  if (default_version_info == NULL)
-return NULL;
-
-  /* Make default info the first node.  */
-  if (first_v != default_version_info)
-{
-  default_version_info->prev->next = default_version_info->next;
-  if (default_version_info->next)
-	default_version_info->next->prev = default_version_info->prev;
-  first_v->prev = default_version_info;
-  default_version_info->next = first_v;
-  default_version_info->prev = NULL;
-}
+  /* The default node is always the beginning of the chain.  */
+  default_version_info = node_v;
+  while (default_version_info->prev != NULL)
+default_version_info = default_version_info->prev;
 
   default_node = default_version_info->this_node;
 
+  /* If there is no default version, just return NULL.  */
+  if (!is_function_default_version (default_node->decl))
+return NULL;
+
   if (targetm.has_i

[PATCH v2 05/16] Update is_function_default_version to work with target_version.

2025-02-17 Thread Alfie Richards

Notably this respects target_version semantics where an unannotated
function can be the default version.
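
The two sets of semantics can be summarised with a small model.  This is a
sketch under stated assumptions: fn_decl_model, fmv_semantics and
is_default_version are hypothetical names, and the attribute values are
modelled as optional strings rather than tree nodes.

```cpp
#include <optional>
#include <string>

// Which attribute drives function multiversioning on the (modelled) target.
enum class fmv_semantics { target_attribute, target_version_attribute };

// Hypothetical, simplified view of a function declaration.
struct fn_decl_model
{
  bool function_versioned = false;            // Part of a versioned set?
  std::optional<std::string> target;          // Value of "target", if any.
  std::optional<std::string> target_version;  // Value of "target_version".
};

inline bool
is_default_version (const fn_decl_model &decl, fmv_semantics semantics)
{
  if (semantics == fmv_semantics::target_attribute)
    {
      // Only versioned functions explicitly marked "default" qualify.
      if (!decl.function_versioned)
        return false;
      return decl.target && *decl.target == "default";
    }

  // target_version semantics: an unannotated function is the default.
  if (!decl.target_version)
    return true;
  return *decl.target_version == "default";
}
```

The asymmetry is the point: under target_version semantics a plain,
unannotated declaration counts as the default, whereas under target semantics
it does not.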

gcc/ChangeLog:

* attribs.cc (is_function_default_version): Add target_version logic.
---
 gcc/attribs.cc | 27 ---
 1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/gcc/attribs.cc b/gcc/attribs.cc
index 56dd18c2fa8..f6667839c01 100644
--- a/gcc/attribs.cc
+++ b/gcc/attribs.cc
@@ -1279,18 +1279,31 @@ make_dispatcher_decl (const tree decl)
   return func_decl;
 }
 
-/* Returns true if DECL is multi-versioned using the target attribute, and this
-   is the default version.  This function can only be used for targets that do
-   not support the "target_version" attribute.  */
+/* Returns true if DECL is a multiversioned default.
+   With the target attribute semantics, returns true if the function is marked
+   as default with the target version.
+   With the target_version attribute semantics, returns true if the function
+   is either not annotated, or annotated as default.  */
 
 bool
 is_function_default_version (const tree decl)
 {
-  if (TREE_CODE (decl) != FUNCTION_DECL
-  || !DECL_FUNCTION_VERSIONED (decl))
+  tree attr;
+  if (TREE_CODE (decl) != FUNCTION_DECL)
 return false;
-  tree attr = lookup_attribute ("target", DECL_ATTRIBUTES (decl));
-  gcc_assert (attr);
+  if (TARGET_HAS_FMV_TARGET_ATTRIBUTE)
+{
+  if (!DECL_FUNCTION_VERSIONED (decl))
+	return false;
+  attr = lookup_attribute ("target", DECL_ATTRIBUTES (decl));
+  gcc_assert (attr);
+}
+  else
+{
+  attr = lookup_attribute ("target_version", DECL_ATTRIBUTES (decl));
+  if (!attr)
+	return true;
+}
   attr = TREE_VALUE (TREE_VALUE (attr));
   return (TREE_CODE (attr) == STRING_CST
 	  && strcmp (TREE_STRING_POINTER (attr), "default") == 0);


[PATCH] tree-optimization/98845 - ICE with tail-merging and DCE/DSE disabled

2025-02-17 Thread Richard Biener
The following shows that tail-merging will make dead SSA defs live
in paths where they weren't before, possibly introducing UB or, as
in this case, uses of abnormals that eventually fail coalescing
later.  The fix is to register such defs for stmt comparison.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

PR tree-optimization/98845
* tree-ssa-tail-merge.cc (stmt_local_def): Consider a
def with no uses not local.

* gcc.dg/pr98845.c: New testcase.
---
 gcc/testsuite/gcc.dg/pr98845.c | 33 +
 gcc/tree-ssa-tail-merge.cc |  8 
 2 files changed, 41 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/pr98845.c

diff --git a/gcc/testsuite/gcc.dg/pr98845.c b/gcc/testsuite/gcc.dg/pr98845.c
new file mode 100644
index 000..074c979678f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr98845.c
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-tree-dce -fno-tree-dse" } */
+
+int n;
+
+__attribute__ ((returns_twice)) void
+foo (void);
+
+void
+bar (void);
+
+void
+quux (int x)
+{
+  if (x)
+++x;
+  else
+{
+  if (n)
+{
+  x = 1;
+  foo ();
+}
+  else
+bar ();
+
+  if (n)
+{
+  ++x;
+  ++n;
+}
+}
+}
diff --git a/gcc/tree-ssa-tail-merge.cc b/gcc/tree-ssa-tail-merge.cc
index d897970079c..857e91c206b 100644
--- a/gcc/tree-ssa-tail-merge.cc
+++ b/gcc/tree-ssa-tail-merge.cc
@@ -336,10 +336,13 @@ stmt_local_def (gimple *stmt)
 
   def_bb = gimple_bb (stmt);
 
+  bool any_use = false;
   FOR_EACH_IMM_USE_FAST (use_p, iter, val)
 {
   if (is_gimple_debug (USE_STMT (use_p)))
continue;
+
+  any_use = true;
   bb = gimple_bb (USE_STMT (use_p));
   if (bb == def_bb)
continue;
@@ -351,6 +354,11 @@ stmt_local_def (gimple *stmt)
   return false;
 }
 
+  /* When there is no use avoid making the stmt live on other paths.
+ This can happen with DCE disabled or not done as seen in PR98845.  */
+  if (!any_use)
+return false;
+
   return true;
 }
 
-- 
2.43.0


[PATCH v2 01/16] Add PowerPC FMV symbol tests.

2025-02-17 Thread Alfie Richards

This tests the mangling of function assembly names when annotated with
target_clones attributes.
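
Judging only from the symbol names these tests expect (for example
"cpu=power6" producing _Z3foov.cpu_power6), the naming scheme can be sketched
as below.  The helper clone_symbol_name is hypothetical, and the single
substitution it performs ('=' to '_') is an assumption inferred from the
expected output, not a description of the actual mangling code.

```cpp
#include <algorithm>
#include <string>
#include <string_view>

// Hypothetical helper: build the assembly name for one clone by appending
// '.' and the version string to the mangled name, with '=' replaced by '_'.
inline std::string
clone_symbol_name (std::string_view mangled, std::string_view version)
{
  std::string suffix (version);
  std::replace (suffix.begin (), suffix.end (), '=', '_');

  std::string name (mangled);
  name += '.';
  name += suffix;
  return name;
}
```

For instance, clone_symbol_name ("_Z3foov", "default") yields the
_Z3foov.default label that the scan-assembler directives look for.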

gcc/testsuite/ChangeLog:

* g++.target/powerpc/mvc-symbols1.C: New test.
* g++.target/powerpc/mvc-symbols2.C: New test.
* g++.target/powerpc/mvc-symbols3.C: New test.
* g++.target/powerpc/mvc-symbols4.C: New test.

Co-authored-by: Alfie Richards 
---
 .../g++.target/powerpc/mvc-symbols1.C | 47 +++
 .../g++.target/powerpc/mvc-symbols2.C | 35 ++
 .../g++.target/powerpc/mvc-symbols3.C | 41 
 .../g++.target/powerpc/mvc-symbols4.C | 29 
 4 files changed, 152 insertions(+)
 create mode 100644 gcc/testsuite/g++.target/powerpc/mvc-symbols1.C
 create mode 100644 gcc/testsuite/g++.target/powerpc/mvc-symbols2.C
 create mode 100644 gcc/testsuite/g++.target/powerpc/mvc-symbols3.C
 create mode 100644 gcc/testsuite/g++.target/powerpc/mvc-symbols4.C

diff --git a/gcc/testsuite/g++.target/powerpc/mvc-symbols1.C b/gcc/testsuite/g++.target/powerpc/mvc-symbols1.C
new file mode 100644
index 000..9424382bf14
--- /dev/null
+++ b/gcc/testsuite/g++.target/powerpc/mvc-symbols1.C
@@ -0,0 +1,47 @@
+/* { dg-do compile } */
+/* { dg-require-ifunc "" } */
+/* { dg-options "-O0" } */
+
+__attribute__((target_clones("default", "cpu=power6", "cpu=power6x")))
+int foo ()
+{
+  return 1;
+}
+
+__attribute__((target_clones("cpu=power6x", "cpu=power6", "default")))
+int foo (int)
+{
+  return 2;
+}
+
+int bar()
+{
+  return foo ();
+}
+
+int bar(int x)
+{
+  return foo (x);
+}
+
+/* { dg-final { scan-assembler-times "\n_Z3foov\.default:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3foov\.cpu_power6:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3foov\.cpu_power6x:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3foov\.resolver:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\tbl _Z3foov\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.type\t_Z3foov, @gnu_indirect_function\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.set\t_Z3foov,_Z3foov\.resolver\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.quad\t_Z3foov\.default\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.quad\t_Z3foov\.cpu_power6\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.quad\t_Z3foov\.cpu_power6x\n" 0 } } */
+
+/* { dg-final { scan-assembler-times "\n_Z3fooi\.default:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3fooi\.cpu_power6:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3fooi\.cpu_power6x:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3fooi\.resolver:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\tbl _Z3fooi\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.type\t_Z3fooi, @gnu_indirect_function\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.set\t_Z3fooi,_Z3fooi\.resolver\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.quad\t_Z3fooi\.default\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.quad\t_Z3fooi\.cpu_power6\n" 0 } } */
+/* { dg-final { scan-assembler-times "\n\t\.quad\t_Z3fooi\.cpu_power6x\n" 1 } } */
diff --git a/gcc/testsuite/g++.target/powerpc/mvc-symbols2.C b/gcc/testsuite/g++.target/powerpc/mvc-symbols2.C
new file mode 100644
index 000..edf54480efd
--- /dev/null
+++ b/gcc/testsuite/g++.target/powerpc/mvc-symbols2.C
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-require-ifunc "" } */
+/* { dg-options "-O0" } */
+
+__attribute__((target_clones("default", "cpu=power6", "cpu=power6x")))
+int foo ()
+{
+  return 1;
+}
+
+__attribute__((target_clones("cpu=power6x", "cpu=power6", "default")))
+int foo (int)
+{
+  return 2;
+}
+
+/* { dg-final { scan-assembler-times "\n_Z3foov\.default:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3foov\.cpu_power6:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3foov\.cpu_power6x:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3foov\.resolver:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.type\t_Z3foov, @gnu_indirect_function\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.set\t_Z3foov,_Z3foov\.resolver\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.quad\t_Z3foov\.default\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.quad\t_Z3foov\.cpu_power6\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.quad\t_Z3foov\.cpu_power6x\n" 0 } } */
+
+/* { dg-final { scan-assembler-times "\n_Z3fooi\.default:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3fooi\.cpu_power6:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3fooi\.cpu_power6x:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3fooi\.resolver:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.type\t_Z3fooi, @gnu_indirect_function\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.set\t_Z3fooi,_Z3fooi\.resolver\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.quad\t_Z3fooi\.default\n" 1 } } */
+/* 

[PATCH v2 00/16] FMV refactor and ACLE compliance.

2025-02-17 Thread Alfie Richards
Hello all,

Thank you for the feedback.

There are some minor changes for this version:

* Correctly attributed the symbol test patches to Andrew Carlotti.
* Changed the recording of the assembly name to be done by
  insert_new_function_version. To me this seems a much more natural time to
  do this, as it's where the function_version_info structure is being
  initialized. It also avoids unnecessary early creation of cgraph structures
  which was causing issues.
* Changed the handling of a lone "default" annotated function definition so
  that it is mangled and an alias is emitted from the dispatched symbol to
  the default function (as discussed with Richard Sandiford).
* Addressed Richard Sandiford's additional comments.
* Merged the removal of target_clone splitting code into the refactor of naming
  mostly to avoid unused function warnings causing the build to fail.

This has again been regression tested and bootstrapped for aarch64-none-linux-gnu
and x86_64-unknown-linux-gnu (more thoroughly this time).
Also cross compiled for the FMV tests on riscv and powerpc.

Kind regards,
Alfie Richards

Alfie Richards (14):
  Add string_slice class.
  Remove unnecessary `record` argument from maybe_version_functions.
  Update is_function_default_version to work with target_version.
  Change function versions to be implicitly ordered.
  Add version of make_attribute supporting string_slice.
  Add get_clone_versions function.
  Add assembler_name to cgraph_function_version_info.
  Add dispatcher_resolver_function and is_target_clone to cgraph_node.
  Add clone_identifier function.
  Refactor FMV name mangling.
  Change target_version semantics to follow ACLE specification.
  Support mixing of target_clones and target_version for aarch64.
  Add error cases and tests for Aarch64 FMV.
  Remove FMV beta warning.

Andrew Carlotti (2):
  Add PowerPC FMV symbol tests.
  Add x86 FMV symbol tests

 gcc/attribs.cc|  93 --
 gcc/attribs.h |   1 +
 gcc/c-family/c-attribs.cc |  11 +-
 gcc/cgraph.cc |  28 +-
 gcc/cgraph.h  |  13 +-
 gcc/cgraphclones.cc   |  16 +-
 gcc/cgraphunit.cc |   9 +
 gcc/config/aarch64/aarch64.cc | 253 +++-
 gcc/config/aarch64/aarch64.opt|   2 +-
 gcc/config/i386/i386-features.cc  | 141 +
 gcc/config/riscv/riscv.cc | 139 +++--
 gcc/config/rs6000/rs6000.cc   | 150 +++---
 gcc/cp/call.cc|  10 +
 gcc/cp/class.cc   |  15 +-
 gcc/cp/cp-gimplify.cc |   6 +-
 gcc/cp/cp-tree.h  |   2 +-
 gcc/cp/decl.cc|  37 ++-
 gcc/cp/typeck.cc  |  10 +
 gcc/doc/invoke.texi   |   5 +-
 gcc/ipa.cc|  11 +
 gcc/multiple_target.cc| 282 --
 gcc/testsuite/g++.target/aarch64/mv-1.C   |   5 +-
 .../g++.target/aarch64/mv-and-mvc-error1.C|   9 +
 .../g++.target/aarch64/mv-and-mvc-error2.C|   9 +
 .../g++.target/aarch64/mv-and-mvc-error3.C|   8 +
 .../g++.target/aarch64/mv-and-mvc1.C  |  37 +++
 .../g++.target/aarch64/mv-and-mvc2.C  |  28 ++
 .../g++.target/aarch64/mv-and-mvc3.C  |  40 +++
 .../g++.target/aarch64/mv-and-mvc4.C  |  37 +++
 gcc/testsuite/g++.target/aarch64/mv-error1.C  |  18 ++
 gcc/testsuite/g++.target/aarch64/mv-error2.C  |   9 +
 gcc/testsuite/g++.target/aarch64/mv-error3.C  |  12 +
 gcc/testsuite/g++.target/aarch64/mv-error4.C  |   9 +
 gcc/testsuite/g++.target/aarch64/mv-error5.C  |   8 +
 gcc/testsuite/g++.target/aarch64/mv-error6.C  |  20 ++
 gcc/testsuite/g++.target/aarch64/mv-error7.C  |  11 +
 gcc/testsuite/g++.target/aarch64/mv-error8.C  |  12 +
 gcc/testsuite/g++.target/aarch64/mv-pragma.C  |   1 -
 .../g++.target/aarch64/mv-symbols1.C  |   1 -
 .../g++.target/aarch64/mv-symbols10.C |  26 ++
 .../g++.target/aarch64/mv-symbols11.C |  29 ++
 .../g++.target/aarch64/mv-symbols12.C |  27 ++
 .../g++.target/aarch64/mv-symbols13.C |  27 ++
 .../g++.target/aarch64/mv-symbols2.C  |  13 +-
 .../g++.target/aarch64/mv-symbols3.C  |   7 +-
 .../g++.target/aarch64/mv-symbols4.C  |   7 +-
 .../g++.target/aarch64/mv-symbols5.C  |   7 +-
 .../g++.target/aarch64/mv-symbols6.C  |  24 ++
 .../g++.target/aarch64/mv-symbols7.C  |  47 +++
 .../g++.target/aarch64/mv-symbols8.C  |  45 +++
 .../g++.target/aarch64/mv-symbols9.C  |  42 +++
 .../g++.target/aarch64/mv-warning1.C  |   9 -
 gcc/testsuite/g++.target/aarch64/mvc-error1.C |   9 +
 gcc/testsuite/g++.target/aarch64/mvc-error2.C |   9 +
 .../g++.target/aarch64/mvc-symbols1.C  

Re: [PATCH] COBOL 6/15 156K lex: lexer

2025-02-17 Thread James K. Lowden
On Sat, 15 Feb 2025 23:32:37 -0500
David Malcolm  wrote:

> > +static bool
> > +is_word_char( char ch ) {
> > +  switch(ch) {
> > +  case '0' ... '9':
> > +  case 'a' ... 'z':
> > +  case 'A' ... 'Z':
> > +  case '$':
> > +  case '-':
> > +  case '_':
> > +    return true;
> > +  }
> > +  return false;
> > +}
> 
> Range based cases are a GCC extension IIRC, so this isn't compatible
> with other C++ compilers

You recall correctly.  I thought I was being clever.  I will recast with 
ISALNUM from libiberty. 
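
For illustration only, here is a minimal sketch of the same predicate without
GNU case ranges.  It is not the revised patch: in-tree code would presumably
use ISALNUM from libiberty's safe-ctype.h, whereas this standalone version
spells the ranges out as comparisons and assumes an ASCII-compatible
execution character set.

```cpp
// Portable stand-in for the range-based switch: no GNU case ranges.
// Assumption: ASCII-compatible character set, so the letter and digit
// ranges are contiguous.  In-tree code would use ISALNUM instead.
static bool
is_word_char (char ch)
{
  const bool alnum = (ch >= '0' && ch <= '9')
		     || (ch >= 'a' && ch <= 'z')
		     || (ch >= 'A' && ch <= 'Z');
  return alnum || ch == '$' || ch == '-' || ch == '_';
}
```

The accepted character set is the same as in the quoted switch; only the
spelling of the ranges changes.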

> > +    gcc_assert(false);
> 
> This shows up in various places; use gcc_unreachable(); for this.

Did not know, thanks.  

--jkl


[PATCH] testsuite, powerpc: Fix vsx-vectorize-* after alignment peeling [PR118567]

2025-02-17 Thread Alex Coplan
Hi,

After the recent alignment peeling enhancements in the vectorizer we
started vectorizing the "checking" loops (that check for the right
result) in gcc.target/powerpc/vsx-vectorize-*.c,  thus skewing the
expected counts of various scan-dump-times tests (causing them to FAIL).
This adds #pragma GCC novector above the relevant loops to prevent them
from being vectorized, thereby fixing the test failures.
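
As a reduced sketch of the pattern (not one of the actual tests; the names
init, compute and check and the array size are made up for illustration), the
kernel loop is left for the vectorizer while the checking loop is marked with
the pragma.  Compilers that do not know the pragma should simply ignore it,
possibly with a warning.

```cpp
#define N 64

static float pa[N], pb[N], pc[N];

// Fill the inputs with small integers so every product is exact.
void init ()
{
  for (int i = 0; i < N; i++)
    {
      pb[i] = (float) i;
      pc[i] = 2.0f;
    }
}

// Kernel loop: the one the test expects to be vectorized.
void compute ()
{
  for (int i = 0; i < N; i++)
    pa[i] = pb[i] * pc[i];
}

// Checking loop: kept scalar so it does not add to the vectorizer's
// dump counts.  Returns the number of mismatches.
int check ()
{
  int bad = 0;
#pragma GCC novector
  for (int i = 0; i < N; i++)
    if (pa[i] != pb[i] * pc[i])
      bad++;
  return bad;
}
```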

Tested with RUNTESTFLAGS="powerpc.exp=vsx-vectorize-*.c" on
powerpc64le-linux-gnu (cfarm29): no FAILs observed with the patch
applied.  OK for trunk?

Thanks,
Alex

gcc/testsuite/ChangeLog:

PR testsuite/118567
* gcc.target/powerpc/vsx-vectorize-1.c: Add #pragma to block
vectorization of result-checking loop.
* gcc.target/powerpc/vsx-vectorize-2.c: Likewise.
* gcc.target/powerpc/vsx-vectorize-3.c: Likewise.
* gcc.target/powerpc/vsx-vectorize-4.c: Likewise.
* gcc.target/powerpc/vsx-vectorize-5.c: Likewise.
* gcc.target/powerpc/vsx-vectorize-6.c: Likewise.
* gcc.target/powerpc/vsx-vectorize-7.c: Likewise.
* gcc.target/powerpc/vsx-vectorize-8.c: Likewise.
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-1.c 
b/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-1.c
index a0e0496d345..927a523568b 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-1.c
@@ -30,6 +30,7 @@ main1 (struct foo * __restrict__ p)
 }
 
   /* check results:  */
+#pragma GCC novector
   for (i = 0; i < N; i++)
 {
   if (p->y[i] != x[i])
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-2.c 
b/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-2.c
index 52c49b27cb7..84a63b3c42f 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-2.c
@@ -15,6 +15,7 @@ void bar (float *pd, float *pa, float *pb, float *pc)
   int i;
 
   /* check results:  */
+#pragma GCC novector
   for (i = 0; i < N; i++)
 {
   if (pa[i] != (pb[i] * pc[i]))
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-3.c 
b/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-3.c
index f2f838a77fc..33054feef57 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-3.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-3.c
@@ -15,6 +15,7 @@ void bar (short *pa, short *pb, short *pc)
   int i;
 
   /* check results:  */
+#pragma GCC novector
   for (i = 0; i < N; i++)
 {
   if (pa[i] != (pb[i] * pc[i]))
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-4.c 
b/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-4.c
index 8bf9dff1712..05262cf76d9 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-4.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-4.c
@@ -15,6 +15,7 @@ void bar (double *pa, double *pb, double *pc)
   int i;
 
   /* check results:  */
+#pragma GCC novector
   for (i = 0; i < N; i++)
 {
   if (pa[i] != (pb[i] * pc[i]))
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-5.c 
b/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-5.c
index 1446e40b1d3..5478390f2ec 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-5.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-5.c
@@ -15,6 +15,7 @@ void bar (char *pa, char *pb, char *pc)
   int i;
 
   /* check results:  */
+#pragma GCC novector
   for (i = 0; i < N; i++)
 {
   if (pa[i] != (pb[i] + pc[i]))
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-6.c 
b/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-6.c
index 6f49ccbbb6a..e1dc35bfd4d 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-6.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-6.c
@@ -15,6 +15,7 @@ void bar (double *pd, double *pa, double *pb, double *pc)
   int i;
 
   /* check results:  */
+#pragma GCC novector
   for (i = 0; i < N; i++)
 {
   if (pa[i] != (pb[i] * pc[i]))
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-7.c 
b/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-7.c
index fde65a521d9..9a1ffd33881 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-7.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-7.c
@@ -15,6 +15,7 @@ void bar (int *pd, int *pa, int *pb, int *pc)
   int i;
 
   /* check results:  */
+#pragma GCC novector
   for (i = 0; i < N; i++)
 {
   if (pa[i] != (pb[i] * pc[i]))
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-8.c 
b/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-8.c
index fb50cd54fd9..2f6fbfb443d 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-8.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-8.c
@@ -15,6 +15,7 @@ void bar (short *pd, short *pa, short *pb, short *pc)
   int i;
 
   /* check results:  */
+#pragma GCC novector
   for (i = 0; i < N; i++)
 {
   if (pa[i] != (pb[i] * pc[i]))


Re: [PATCH] arm: Increment LABEL_NUSES when using minipool_vector_label

2025-02-17 Thread Richard Earnshaw
On 17/02/2025 13:54, Richard Earnshaw (lists) wrote:
> On 17/02/2025 12:42, H.J. Lu wrote:
>> On Mon, Feb 17, 2025 at 7:08 PM Richard Earnshaw (lists)
>>  wrote:
>>>
>>> On 13/02/2025 21:43, H.J. Lu wrote:
>>>> Increment LABEL_NUSES when using minipool_vector_label to avoid the zero
>>>> use count on minipool_vector_label.
>>>>
>>>> PR target/118866
>>>> * config/arm/arm.cc (arm_reorg): Increment LABEL_NUSES when
>>>> using minipool_vector_label.
>>>>
>>>
>>> Whilst this patch isn't wrong per se, I'm concerned that it's likely due to 
>>> something else violating the assumptions that a 
>>> TARGET_MACHINE_DEPENDENT_REORG pass implementation is entitled to make.  On 
>>> arm, the insertion of minipools in the code has to assume that the BB 
>>> layout won't change after that point (otherwise the offset calculations 
>>> will be wrong).  In fact, only changes that reduce code size within a 
>>> single basic block are going to be safe at this point.
>>>
>>> So what's changed to make this patch needed, and is it being run too late?
>>>
>>> R.
>>
>> I am working on:
>>
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47253
>>
>> After a sibcall is folded into a conditional branch,  the old
>> conditional branch target label becomes used.   But the
>> basic block can't be removed if it is still reachable.  In this
>> case, I'd like to change final_scan_insn_1 to skip the
>> code label if its LABEL_NUSES is 0.  This means that
>> all referenced labels should have their LABEL_NUSES != 0.
>> I will find another way to deal with it.
>>
>> --
>> H.J.
> 
> Right, but why does detection of this need to be delayed until after md_reorg 
> has been run?  Can't the pass be done much earlier?  I'm just concerned about 
> the ordering of the passes, something that has the potential to re-arrange 
> code like this could mess up the assumptions that the md_reorg pass has to 
> make.
> 
> R.

For example, can you set TODO_cleanup_cfg on your pass and then run it, say 
right after the final BB re-org pass?

R.


Re: [PATCH] pair-fusion: A couple of fixes for sp updates [PR118429]

2025-02-17 Thread Alex Coplan
On 17/02/2025 16:15, Richard Sandiford wrote:
> Alex Coplan  writes:
> > Hi Richard,
> >
> > On 29/01/2025 16:44, Richard Sandiford wrote:
> >> The PR showed two issues with pair-fusion.  The first is that the pass
> >> treated stack pointer deallocations as ordinary register updates, and so
> >> might move them earlier than another stack access (through a different
> >> base register) that doesn't alias the pair candidate.
> >> 
> >> The simplest fix for that seems to be to prevent the stack deallocation
> >> from being moved.  This part might (or might not) be a latent source of
> >> wrong code and so worth backporting in some form.  (The patch as-is
> >> won't work for GCC 14.)
> >> 
> >> The second issue only started with r15-6551, which added a memory
> >> write to stack allocations and deallocations.  We should use the
> >> existing tombstone mechanism to preserve the associated memory
> >> definition.  (Deleting definitions immediately would have quadratic
> >> complexity in the worst case.)
> >
> > Thanks a lot for taking care of this while I was away.  I just have a
> > couple of comments below.
> >
> >> 
> >> Tested on aarch64-linux-gnu.  OK to install?
> >> 
> >> Richard
> >> 
> >> 
> >> gcc/
> >>PR rtl-optimization/118429
> >>* pair-fusion.cc (latest_hazard_before): Add an extra parameter
> >>to say whether the instruction is a load or a store.  If the
> >>instruction is not a load or store and has memory side effects,
> >>prevent it from being moved earlier.
> >>(pair_fusion::find_trailing_add): Update call accordingly.
> >>(pair_fusion_bb_info::fuse_pair): If the trailing addition had
> >>a memory side-effect, use a tombstone to preserve it.
> >> 
> >> gcc/testsuite/
> >>PR rtl-optimization/118429
> >>* gcc.c-torture/compile/pr118429.c: New test.
> >> ---
> >>  gcc/pair-fusion.cc| 45 ++-
> >>  .../gcc.c-torture/compile/pr118429.c  |  7 +++
> >>  2 files changed, 40 insertions(+), 12 deletions(-)
> >>  create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr118429.c
> >> 
> >> diff --git a/gcc/pair-fusion.cc b/gcc/pair-fusion.cc
> >> index b6643ca4812..602e572ab6c 100644
> >> --- a/gcc/pair-fusion.cc
> >> +++ b/gcc/pair-fusion.cc
> >> @@ -573,11 +573,13 @@ pair_fusion_bb_info::track_access (insn_info *insn, 
> >> bool load_p, rtx mem)
> >>  // If IGNORE_INSN is non-NULL, we should further ignore any hazards 
> >> arising
> >>  // from that insn.
> >>  //
> >> -// N.B. we ignore any defs/uses of memory here as we deal with that 
> >> separately,
> >> -// making use of alias disambiguation.
> >> +// IS_LOAD_STORE is true if INSN is one of the loads or stores in the
> >> +// candidate pair.  We ignore any defs/uses of memory in such instructions
> >> +// as we deal with that separately, making use of alias disambiguation.
> >>  static insn_info *
> >>  latest_hazard_before (insn_info *insn, rtx *ignore,
> >> -insn_info *ignore_insn = nullptr)
> >> +insn_info *ignore_insn = nullptr,
> >> +bool is_load_store = true)
> >>  {
> >>insn_info *result = nullptr;
> >>  
> >> @@ -588,6 +590,10 @@ latest_hazard_before (insn_info *insn, rtx *ignore,
> >>&& find_reg_note (insn->rtl (), REG_EH_REGION, NULL_RTX))
> >>  return insn->prev_nondebug_insn ();
> >>  
> >> +  if (!is_load_store
> >> +  && accesses_include_memory (insn->defs ()))
> >> +return insn->prev_nondebug_insn ();
> >
> > This seems like it might be a little too restrictive.  I agree that it's
> > a nice and simple way of solving the problem, but wouldn't it be enough
> > to prevent moving such accesses (stack deallocations) above the latest
> > preceding def or use of mem?  Certainly we don't want to start
> > attempting alias analysis here, but is the above suggestion not a happy
> > middle ground (between a simple solution and not overly restricting
> > optimisation)?
> 
> Would it help in practice though?  Although it is possible to combine
> a deallocation with preceding stores, that only happens for dead code,
> in which case the better optimisation is to delete the stores.
> If we're combining with loads, the loads would normally be restoring
> registers for the caller, in which case the loads could be moved
> forward to the deallocation (since nothing would use or clobber
> the loaded values between the two points).

I see.  I must admit that I don't immediately see why this can only
occur with dead stores, but I agree with your reasoning for the load
case, and it probably isn't worth spending much more time worrying about
optimising for such edge-case opportunities, anyway.  So all good :)

Thanks,
Alex

> 
> But yeah, I certainly don't object to doing that if you prefer.
> The patch is in now though (was asked to fix it while you were away),
> so it might be easier if you change it to a version that you're
> happier with (since you can self-approve it).
> 
> Thanks,
> Richar

Re: [PATCH] libgomp: avoid unused-variable-error when configured with CFLAGS=-DNDEBUG

2025-02-17 Thread shynur .
> As part of the Git commit message, please include a ChangeLog update (see
>  and 'git log').

I've written a new patch which is attached.

> Basically, 'contrib/gcc-changelog/git_check_commit.py --print-changelog'
> needs to accept your commit.

This time it passed the check!  Thank you, Thomas!

>> -   struct target_mem_desc *k_tgt = k->tgt;
>> -   bool is_tgt_unmapped = gomp_remove_var (devicep, k);
>> +   bool is_tgt_unmapped __attribute__((unused))
>> + = gomp_remove_var (devicep, k);
>>  /* It would be bad if TGT got unmapped while we're still iterating
>> over its LIST_COUNT, and also expect to use it in the following
>> code.  */
>>  assert (!is_tgt_unmapped
>> -   || k_tgt != tgt);
>> +   || k->tgt != tgt);
>>}
>
> Please check: if I remember correctly, it's no longer valid to dereference
> 'k->tgt' after 'gomp_remove_var (devicep, k);'?  (That's why we preserve the
> former as 'k_tgt'.)

My mistake.  I'm sorry.
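
For reference, a reduced sketch of the shape Thomas describes, using
hypothetical stand-in types rather than the real libgomp structures:
remove_var below plays the role of gomp_remove_var, and the point is that the
pointer is saved before the call and both assert-only variables are marked
unused so that an NDEBUG build stays free of unused-variable warnings.

```cpp
#include <cassert>

// Hypothetical stand-ins for the libgomp types; not the real layout.
struct tgt_desc { int refcount; };
struct key { tgt_desc *tgt; };

// Stand-in for gomp_remove_var: after it returns, K's tgt field must
// no longer be consulted.  Returns whether the target was unmapped.
static bool
remove_var (key *k)
{
  k->tgt = nullptr;
  return false;
}

void
remove_and_check (key *k, tgt_desc *tgt)
{
  // Save the pointer before the removal; both variables are read only
  // inside assert, so mark them unused for NDEBUG builds.
  tgt_desc *k_tgt __attribute__((unused)) = k->tgt;
  bool is_tgt_unmapped __attribute__((unused)) = remove_var (k);
  assert (!is_tgt_unmapped || k_tgt != tgt);
}
```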

0001-Add-__attribute__-unused-to-variables-used-only-in-a.patch
Description: 0001-Add-__attribute__-unused-to-variables-used-only-in-a.patch


[pushed] c++: add fixed test [PR96364]

2025-02-17 Thread Marek Polacek
Tested x86_64-pc-linux-gnu, applying to trunk.

-- >8 --
We were rejecting this, but the test compiles correctly since r14-6346.

PR c++/96364

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/gen-attrs-88.C: New test.
---
 gcc/testsuite/g++.dg/cpp0x/gen-attrs-88.C | 14 ++
 1 file changed, 14 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/gen-attrs-88.C

diff --git a/gcc/testsuite/g++.dg/cpp0x/gen-attrs-88.C 
b/gcc/testsuite/g++.dg/cpp0x/gen-attrs-88.C
new file mode 100644
index 000..f90b7a4661d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/gen-attrs-88.C
@@ -0,0 +1,14 @@
+// PR c++/96364
+// { dg-do compile { target c++14 } }
+
+auto a[[]] [[]]();
+auto a() {}
+
+void v[[]] [[]]();
+void v() {}
+
+void g()
+{
+  v();
+  return a();
+}

base-commit: dfd0ced98fcf62c4d24979b74c1d52334ff62bfc
-- 
2.48.1



Re: [PATCH] COBOL 3/15 92K bld: config and build machinery

2025-02-17 Thread James K. Lowden
On Sat, 15 Feb 2025 21:18:50 +
Sam James  wrote:

> > Subject: [PATCH] Add 'cobol' to 17 files
> 
> The commit message summary (first line) should say something like the
> email title, so 'cobol: bld: config and build machinery'.

Roger, will do next time.  

> > +YFLAGS = -Werror -Wmidrule-values -Wno-yacc \
> > +   --debug --verbose
> 
> Unconditional -Werror here looks off, should be based on the configure
> flag 

It's our intention here that parse.y and cdf.y be processed by Bison with no 
warnings.  Just those files specifically, not the compiler in general.  

> (--debug looks odd too).

It does look odd.  It's one way to enable tracing in Bison.  Tracing doesn't 
actually happen unless the user uses a command-line option for gcobol, but 
*can't* happen unless the capability is compiled in.  

I chose to externalize the feature using --debug.  I could have embedded the 
choice in the source code with %debug or "%define parse.trace".  

> so w/e.

(I'm unfamiliar with that expression.)

> > +  [AC_MSG_ERROR([Can't find stdio.h.
> > +You must have a usable C system for the target already installed, at least
> > +including headers and, preferably, the library, before you can configure
> > +the Objective C runtime system.  If necessary, install gcc now with
> 
> cobol
>
> > +\`LANGUAGES=c', then the target library, then build with 
> > \`LANGUAGES=gcobol'.])])
> 
> I know this is copy-pasted (so objc will have to be fixed too), but this
> LANGUAGES= thing isn't correct and hasn't been for quite some time.

We will remove the message, as is done in libgrust.  

I will fix these mechanical issues in the next revision.  

--jkl



Re: [PATCH] libgcc: i386/linux-unwind.h: always rely on sys/ucontext.h

2025-02-17 Thread Roman Kagan
On Thu, Jan 02, 2025 at 04:32:17PM +0100, Roman Kagan wrote:
> When gcc is built for x86_64-linux-musl target, stack unwinding from
> within signal handler stops at the innermost signal frame.  The reason
> for this behavior is that the signal trampoline is not accompanied by
> appropriate CFI directives, and the fallback path in libgcc to recognize
> it by the code sequence is only enabled for glibc except 2.0.  The
> latter is motivated by the lack of sys/ucontext.h in that glibc version.
> 
> Given that all relevant libc-s ship sys/ucontext.h for over a decade,
> and that other arches aren't shy of unconditionally using it, follow
> suit and remove the preprocessor condition, too.
> 
> Signed-off-by: Roman Kagan 
> ---
>  libgcc/config/i386/linux-unwind.h | 7 ---
>  1 file changed, 7 deletions(-)
> 
> diff --git a/libgcc/config/i386/linux-unwind.h 
> b/libgcc/config/i386/linux-unwind.h
> index fe316ee02cf2..8f37642bbf55 100644
> --- a/libgcc/config/i386/linux-unwind.h
> +++ b/libgcc/config/i386/linux-unwind.h
> @@ -33,12 +33,6 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
> If not, see
>  
>  #ifndef inhibit_libc
>  
> -/* There's no sys/ucontext.h for glibc 2.0, so no
> -   signal-turned-exceptions for them.  There's also no configure-run for
> -   the target, so we can't check on (e.g.) HAVE_SYS_UCONTEXT_H.  Using the
> -   target libc version macro should be enough.  */
> -#if defined __GLIBC__ && !(__GLIBC__ == 2 && __GLIBC_MINOR__ == 0)
> -
>  #include 
>  #include 
>  
> @@ -199,5 +193,4 @@ x86_frob_update_context (struct _Unwind_Context *context,
>  }
>  
>  #endif /* ifdef __x86_64__  */
> -#endif /* not glibc 2.0 */
>  #endif /* ifdef inhibit_libc  */

Ping?

Roman.


[pushed] c++: extended temps and statement-exprs [PR118763]

2025-02-17 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk.

-- 8< --

My last patch for 118856 broke the test for 118763 (which my testing didn't
catch, for some reason), because it effectively reverted Jakub's recent fix
(r15-7415) for that bug.  It seems we need a new flag to indicate internal
temporaries.

In that patch Jakub wondered if other uses of CLEANUP_EH_ONLY would have the
same issue with jumps out of a statement-expr, and indeed it seems that
maybe_push_temp_cleanup and now set_up_extended_ref_temp have the same
problem.  Since maybe_push_temp_cleanup already uses a flag, we can easily
stop setting CLEANUP_EH_ONLY there as well.  Since set_up_extended_ref_temp
doesn't, working around this issue there will be more involved.

PR c++/118856
PR c++/118763

gcc/cp/ChangeLog:

* cp-tree.h (TARGET_EXPR_INTERNAL_P): New.
* call.cc (extend_temps_r): Check it instead of CLEANUP_EH_ONLY.
* tree.cc (get_internal_target_expr): Set it instead.
* typeck2.cc (maybe_push_temp_cleanup): Don't set CLEANUP_EH_ONLY.

gcc/testsuite/ChangeLog:

* g++.dg/ext/stmtexpr29.C: New test.
---
 gcc/cp/cp-tree.h  |  6 ++
 gcc/cp/call.cc|  9 ++---
 gcc/cp/tree.cc|  4 +---
 gcc/cp/typeck2.cc |  1 -
 gcc/testsuite/g++.dg/ext/stmtexpr29.C | 27 +++
 5 files changed, 40 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ext/stmtexpr29.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 84bcbf29fa0..8866d5e2c2b 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -514,6 +514,7 @@ extern GTY(()) tree cp_global_trees[CPTI_MAX];
   OVL_LOOKUP_P (in OVERLOAD)
   LOOKUP_FOUND_P (in RECORD_TYPE, UNION_TYPE, ENUMERAL_TYPE, 
NAMESPACE_DECL)
   FNDECL_MANIFESTLY_CONST_EVALUATED (in FUNCTION_DECL)
+  TARGET_EXPR_INTERNAL_P (in TARGET_EXPR)
5: IDENTIFIER_VIRTUAL_P (in IDENTIFIER_NODE)
   FUNCTION_RVALUE_QUALIFIED (in FUNCTION_TYPE, METHOD_TYPE)
   CALL_EXPR_REVERSE_ARGS (in CALL_EXPR, AGGR_INIT_EXPR)
@@ -5608,6 +5609,11 @@ decl_template_parm_check (const_tree t, const char *f, 
int l, const char *fn)
 #define TARGET_EXPR_ELIDING_P(NODE) \
   TREE_LANG_FLAG_3 (TARGET_EXPR_CHECK (NODE))
 
+/* True if this TARGET_EXPR is for holding an implementation detail like a
+   cleanup flag or loop index, and should be ignored by extend_all_temps.  */
+#define TARGET_EXPR_INTERNAL_P(NODE) \
+  TREE_LANG_FLAG_4 (TARGET_EXPR_CHECK (NODE))
+
 /* True if NODE is a TARGET_EXPR that just expresses a copy of its INITIAL; if
the initializer has void type, it's doing something more complicated.  */
 #define SIMPLE_TARGET_EXPR_P(NODE) \
diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index 03130f80f86..be9b0cf62f1 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -14922,10 +14922,13 @@ extend_temps_r (tree *tp, int *walk_subtrees, void 
*data)
   if (TREE_CODE (*p) == TARGET_EXPR
   /* An eliding TARGET_EXPR isn't a temporary at all.  */
   && !TARGET_EXPR_ELIDING_P (*p)
-  /* A TARGET_EXPR with CLEANUP_EH_ONLY is an artificial variable used
-during initialization, and need not be extended.  */
-  && !CLEANUP_EH_ONLY (*p))
+  /* A TARGET_EXPR with TARGET_EXPR_INTERNAL_P is an artificial variable
+used during initialization that need not be extended.  */
+  && !TARGET_EXPR_INTERNAL_P (*p))
 {
+  /* A CLEANUP_EH_ONLY expr should also have TARGET_EXPR_INTERNAL_P.  */
+  gcc_checking_assert (!CLEANUP_EH_ONLY (*p));
+
   tree subinit = NULL_TREE;
   tree slot = TARGET_EXPR_SLOT (*p);
   *p = set_up_extended_ref_temp (d->decl, *p, d->cleanups, &subinit,
diff --git a/gcc/cp/tree.cc b/gcc/cp/tree.cc
index 611930b3c28..5628a576f01 100644
--- a/gcc/cp/tree.cc
+++ b/gcc/cp/tree.cc
@@ -984,9 +984,7 @@ get_internal_target_expr (tree init)
   init = convert_bitfield_to_declared_type (init);
   tree t = build_target_expr_with_type (init, TREE_TYPE (init),
tf_warning_or_error);
-  /* No internal variable should have a cleanup on the normal path, and
- extend_temps_r checks this flag to decide whether to extend.  */
-  CLEANUP_EH_ONLY (t) = true;
+  TARGET_EXPR_INTERNAL_P (t) = true;
   return t;
 }
 
diff --git a/gcc/cp/typeck2.cc b/gcc/cp/typeck2.cc
index 2555e9c1b64..1adc05aa86d 100644
--- a/gcc/cp/typeck2.cc
+++ b/gcc/cp/typeck2.cc
@@ -459,7 +459,6 @@ maybe_push_temp_cleanup (tree sub, vec **flags)
 {
   tree tx = get_internal_target_expr (boolean_true_node);
   tree flag = TARGET_EXPR_SLOT (tx);
-  CLEANUP_EH_ONLY (tx) = true;
   TARGET_EXPR_CLEANUP (tx) = build3 (COND_EXPR, void_type_node,
 flag, cleanup, void_node);
   add_stmt (tx);
diff --git a/gcc/testsuite/g++.dg/ext/stmtexpr29.C 
b/gcc/testsuite/g++.dg/ext/stmtexpr29.C
new file mode 100644
index 0

Re: [PATCH] COBOL 5/15 380K hdr: header files

2025-02-17 Thread James K. Lowden
On Sat, 15 Feb 2025 21:30:16 +
Sam James  wrote:

> > + * This stand-in for std::regex was written because the
> > implementation provided
> > + * by the GCC libstdc++ in GCC 11 proved too slow, where "slow"
> > means "appears
> > + * not to terminate".  Some invocations of std::regex_search took
> > over 5
> 
> Is this still the case now in GCC trunk (15)? Is there a bug report to
> link to in the comment if so?

I didn't pursue a bug report for this problem, so as not to try to boil the 
ocean. 

AFAIK, the poor performance of std::regex is widely acknowledged and is somehow 
a feature of how it's defined.  Jonathan Wakely understands the problem better 
than I do.  

Although under no obligation to use std::regex, I thought I'd try it out and, 
honestly, it's not a bad interface.  But the performance was awful.  It was 
easy to re-implement what I needed from std::regex in terms of regex(3), and 
left the door open to revert simply by changing "using namespace dts".  
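
To give a rough idea of what such a stand-in can look like, here is a minimal
sketch built on POSIX regcomp/regexec.  It is not the code from the patch: the
namespace name `sketch` and the reduced interface (no match results) are
assumptions made for illustration.

```cpp
#include <regex.h>
#include <stdexcept>
#include <string>

namespace sketch {

// Owns a compiled POSIX extended regular expression.
class regex {
  regex_t re_;
public:
  explicit regex (const std::string &pattern)
  {
    if (regcomp (&re_, pattern.c_str (), REG_EXTENDED) != 0)
      throw std::invalid_argument ("bad pattern: " + pattern);
  }
  regex (const regex &) = delete;
  regex &operator= (const regex &) = delete;
  ~regex () { regfree (&re_); }
  const regex_t *get () const { return &re_; }
};

// Loosely mirrors the two-argument std::regex_search: true if RE
// matches anywhere in INPUT.  Match positions are not reported.
inline bool
regex_search (const std::string &input, const regex &re)
{
  return regexec (re.get (), input.c_str (), 0, nullptr, 0) == 0;
}

} // namespace sketch
```

Because the names follow std::regex, a caller can switch between the two
implementations by changing a single using-directive.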

Is the state of gcc-15 relevant, though?  gcc is frequently built using 
whatever C++ compiler is installed.  If my understanding is correct, to rely on 
the installed std::regex is just to set a trap for the user.  

--jkl


Re: [PATCH] arm: Increment LABEL_NUSES when using minipool_vector_label

2025-02-17 Thread Richard Earnshaw (lists)
On 17/02/2025 12:42, H.J. Lu wrote:
> On Mon, Feb 17, 2025 at 7:08 PM Richard Earnshaw (lists)
>  wrote:
>>
>> On 13/02/2025 21:43, H.J. Lu wrote:
>>> Increment LABEL_NUSES when using minipool_vector_label to avoid the zero
>>> use count on minipool_vector_label.
>>>
>>> PR target/118866
>>> * config/arm/arm.cc (arm_reorg): Increment LABEL_NUSES when
>>> using minipool_vector_label.
>>>
>>
>> Whilst this patch isn't wrong per se, I'm concerned that it's likely due to 
>> something else violating the assumptions that a 
>> TARGET_MACHINE_DEPENDENT_REORG pass implementation is entitled to make.  On 
>> arm, the insertion of minipools in the code has to assume that the BB layout 
>> won't change after that point (otherwise the offset calculations will be 
>> wrong).  In fact, only changes that reduce code size within a single basic 
>> block are going to be safe at this point.
>>
>> So what's changed to make this patch needed, and is it being run too late?
>>
>> R.
> 
> I am working on:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47253
> 
> After a sibcall is folded into a conditional branch,  the old
> conditional branch target label becomes used.   But the
> basic block can't be removed if it is still reachable.  In this
> case, I'd like to change final_scan_insn_1 to skip the
> code label if its LABEL_NUSES is 0.  This means that
> all referenced labels should have their LABEL_NUSES != 0.
> I will find another way to deal with it.
> 
> --
> H.J.

Right, but why does detection of this need to be delayed until after md_reorg 
has been run?  Can't the pass be done much earlier?  I'm just concerned about 
the ordering of the passes, something that has the potential to re-arrange code 
like this could mess up the assumptions that the md_reorg pass has to make.

R.


Re: [PATCH] pair-fusion: A couple of fixes for sp updates [PR118429]

2025-02-17 Thread Alex Coplan
Hi Richard,

On 29/01/2025 16:44, Richard Sandiford wrote:
> The PR showed two issues with pair-fusion.  The first is that the pass
> treated stack pointer deallocations as ordinary register updates, and so
> might move them earlier than another stack access (through a different
> base register) that doesn't alias the pair candidate.
> 
> The simplest fix for that seems to be to prevent the stack deallocation
> from being moved.  This part might (or might not) be a latent source of
> wrong code and so worth backporting in some form.  (The patch as-is
> won't work for GCC 14.)
> 
> The second issue only started with r15-6551, which added a memory
> write to stack allocations and deallocations.  We should use the
> existing tombstone mechanism to preserve the associated memory
> definition.  (Deleting definitions immediately would have quadratic
> complexity in the worst case.)

Thanks a lot for taking care of this while I was away.  I just have a
couple of comments below.

> 
> Tested on aarch64-linux-gnu.  OK to install?
> 
> Richard
> 
> 
> gcc/
>   PR rtl-optimization/118429
>   * pair-fusion.cc (latest_hazard_before): Add an extra parameter
>   to say whether the instruction is a load or a store.  If the
>   instruction is not a load or store and has memory side effects,
>   prevent it from being moved earlier.
>   (pair_fusion::find_trailing_add): Update call accordingly.
>   (pair_fusion_bb_info::fuse_pair): If the trailing addition had
>   a memory side-effect, use a tombstone to preserve it.
> 
> gcc/testsuite/
>   PR rtl-optimization/118429
>   * gcc.c-torture/compile/pr118429.c: New test.
> ---
>  gcc/pair-fusion.cc| 45 ++-
>  .../gcc.c-torture/compile/pr118429.c  |  7 +++
>  2 files changed, 40 insertions(+), 12 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr118429.c
> 
> diff --git a/gcc/pair-fusion.cc b/gcc/pair-fusion.cc
> index b6643ca4812..602e572ab6c 100644
> --- a/gcc/pair-fusion.cc
> +++ b/gcc/pair-fusion.cc
> @@ -573,11 +573,13 @@ pair_fusion_bb_info::track_access (insn_info *insn, 
> bool load_p, rtx mem)
>  // If IGNORE_INSN is non-NULL, we should further ignore any hazards arising
>  // from that insn.
>  //
> -// N.B. we ignore any defs/uses of memory here as we deal with that 
> separately,
> -// making use of alias disambiguation.
> +// IS_LOAD_STORE is true if INSN is one of the loads or stores in the
> +// candidate pair.  We ignore any defs/uses of memory in such instructions
> +// as we deal with that separately, making use of alias disambiguation.
>  static insn_info *
>  latest_hazard_before (insn_info *insn, rtx *ignore,
> -   insn_info *ignore_insn = nullptr)
> +   insn_info *ignore_insn = nullptr,
> +   bool is_load_store = true)
>  {
>insn_info *result = nullptr;
>  
> @@ -588,6 +590,10 @@ latest_hazard_before (insn_info *insn, rtx *ignore,
>&& find_reg_note (insn->rtl (), REG_EH_REGION, NULL_RTX))
>  return insn->prev_nondebug_insn ();
>  
> +  if (!is_load_store
> +  && accesses_include_memory (insn->defs ()))
> +return insn->prev_nondebug_insn ();

This seems like it might be a little too restrictive.  I agree that it's
a nice and simple way of solving the problem, but wouldn't it be enough
to prevent moving such accesses (stack deallocations) above the latest
preceding def or use of mem?  Certainly we don't want to start
attempting alias analysis here, but is the above suggestion not a happy
middle ground (between a simple solution and not overly restricting
optimisation)?
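
To make the suggestion concrete, here is a minimal sketch on a toy model.
The names toy_insn and latest_mem_access_before are made up for
illustration and are not rtl-ssa API; the real code would walk the
memory defs/uses recorded for the block instead.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Toy stand-in for an instruction; NOT the rtl-ssa insn_info class.
struct toy_insn
{
  bool reads_mem;
  bool writes_mem;
};

// Return the index of the latest instruction before POS that defines or
// uses memory, or nullopt if there is none.  A stack deallocation at POS
// could then be moved up to, but not above, that instruction.
// Assumes POS <= INSNS.size ().
inline std::optional<std::size_t>
latest_mem_access_before (const std::vector<toy_insn> &insns, std::size_t pos)
{
  for (std::size_t i = pos; i-- > 0;)
    if (insns[i].reads_mem || insns[i].writes_mem)
      return i;
  return std::nullopt;
}
```

With no alias analysis involved, the hazard is simply the nearest
preceding memory access, whatever it touches.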

> +
>// Return true if we registered the hazard.
>auto hazard = [&](insn_info *h) -> bool
>  {
> @@ -1238,7 +1244,7 @@ pair_fusion::find_trailing_add (insn_info *insns[2],
>insns[0]->uid (), insns[1]->uid ());
>  };
>  
> -  insn_info *hazard = latest_hazard_before (cand, nullptr, insns[1]);
> +  insn_info *hazard = latest_hazard_before (cand, nullptr, insns[1], false);
>if (!hazard || *hazard <= *pair_dst)
>  {
>if (dump_file)
> @@ -1633,7 +1639,7 @@ pair_fusion_bb_info::fuse_pair (bool load_p,
>insn_info *insns[2] = { first, second };
>  
>auto_vec changes;
> -  auto_vec tombstone_uids (2);
> +  auto_vec tombstone_uids;
>  
>rtx pats[2] = {
>  PATTERN (first->rtl ()),
> @@ -1824,6 +1830,16 @@ pair_fusion_bb_info::fuse_pair (bool load_p,
>  validate_change (rti, &REG_NOTES (rti), reg_notes, true);
>};
>  
> +  // Turn CHANGE into a memory definition tombstone.
> +  auto make_tombstone = [&](insn_change *change)
> +{
> +  tombstone_uids.quick_push (change->insn ()->uid ());
> +  rtx_insn *rti = change->insn ()->rtl ();
> +  validate_change (rti, &PATTERN (rti), gen_tombstone (), true);
> +  validate_change (rti, &REG_NOTES (rti), NULL_RTX, true);
> +  change->new_uses = use_array ();
> +};
> +
>if (load_p)
>  {
>

Re: [PATCH] pair-fusion: A couple of fixes for sp updates [PR118429]

2025-02-17 Thread Richard Sandiford
Alex Coplan  writes:
> Hi Richard,
>
> On 29/01/2025 16:44, Richard Sandiford wrote:
>> The PR showed two issues with pair-fusion.  The first is that the pass
>> treated stack pointer deallocations as ordinary register updates, and so
>> might move them earlier than another stack access (through a different
>> base register) that doesn't alias the pair candidate.
>> 
>> The simplest fix for that seems to be to prevent the stack deallocation
>> from being moved.  This part might (or might not) be a latent source of
>> wrong code and so worth backporting in some form.  (The patch as-is
>> won't work for GCC 14.)
>> 
>> The second issue only started with r15-6551, which added a memory
>> write to stack allocations and deallocations.  We should use the
>> existing tombstone mechanism to preserve the associated memory
>> definition.  (Deleting definitions immediately would have quadratic
>> complexity in the worst case.)
>
> Thanks a lot for taking care of this while I was away.  I just have a
> couple of comments below.
>
>> 
>> Tested on aarch64-linux-gnu.  OK to install?
>> 
>> Richard
>> 
>> 
>> gcc/
>>  PR rtl-optimization/118429
>>  * pair-fusion.cc (latest_hazard_before): Add an extra parameter
>>  to say whether the instruction is a load or a store.  If the
>>  instruction is not a load or store and has memory side effects,
>>  prevent it from being moved earlier.
>>  (pair_fusion::find_trailing_add): Update call accordingly.
>>  (pair_fusion_bb_info::fuse_pair): If the trailing addition had
>>  a memory side-effect, use a tombstone to preserve it.
>> 
>> gcc/testsuite/
>>  PR rtl-optimization/118429
>>  * gcc.c-torture/compile/pr118429.c: New test.
>> ---
>>  gcc/pair-fusion.cc| 45 ++-
>>  .../gcc.c-torture/compile/pr118429.c  |  7 +++
>>  2 files changed, 40 insertions(+), 12 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr118429.c
>> 
>> diff --git a/gcc/pair-fusion.cc b/gcc/pair-fusion.cc
>> index b6643ca4812..602e572ab6c 100644
>> --- a/gcc/pair-fusion.cc
>> +++ b/gcc/pair-fusion.cc
>> @@ -573,11 +573,13 @@ pair_fusion_bb_info::track_access (insn_info *insn, 
>> bool load_p, rtx mem)
>>  // If IGNORE_INSN is non-NULL, we should further ignore any hazards arising
>>  // from that insn.
>>  //
>> -// N.B. we ignore any defs/uses of memory here as we deal with that 
>> separately,
>> -// making use of alias disambiguation.
>> +// IS_LOAD_STORE is true if INSN is one of the loads or stores in the
>> +// candidate pair.  We ignore any defs/uses of memory in such instructions
>> +// as we deal with that separately, making use of alias disambiguation.
>>  static insn_info *
>>  latest_hazard_before (insn_info *insn, rtx *ignore,
>> -  insn_info *ignore_insn = nullptr)
>> +  insn_info *ignore_insn = nullptr,
>> +  bool is_load_store = true)
>>  {
>>insn_info *result = nullptr;
>>  
>> @@ -588,6 +590,10 @@ latest_hazard_before (insn_info *insn, rtx *ignore,
>>&& find_reg_note (insn->rtl (), REG_EH_REGION, NULL_RTX))
>>  return insn->prev_nondebug_insn ();
>>  
>> +  if (!is_load_store
>> +  && accesses_include_memory (insn->defs ()))
>> +return insn->prev_nondebug_insn ();
>
> This seems like it might be a little too restrictive.  I agree that it's
> a nice and simple way of solving the problem, but wouldn't it be enough
> to prevent moving such accesses (stack deallocations) above the latest
> preceding def or use of mem?  Certainly we don't want to start
> attempting alias analysis here, but is the above suggestion not a happy
> middle ground (between a simple solution and not overly restricting
> optimisation)?

Would it help in practice though?  Although it is possible to combine
a deallocation with preceding stores, that only happens for dead code,
in which case the better optimisation is to delete the stores.
If we're combining with loads, the loads would normally be restoring
registers for the caller, in which case the loads could be moved
forward to the deallocation (since nothing would use or clobber
the loaded values between the two points).

But yeah, I certainly don't object to doing that if you prefer.
The patch is in now though (was asked to fix it while you were away),
so it might be easier if you change it to a version that you're
happier with (since you can self-approve it).

Thanks,
Richard


[PATCH v2 14/16] Support mixing of target_clones and target_version for aarch64.

2025-02-17 Thread Alfie Richards

This patch adds support for the combination of target_clones and
target_version in the definition of a versioned function.

This patch changes is_function_default_version to consider a function
declaration annotated with target_clones containing default to be a
default version. It also changes the common_function_version hook to
consider two functions annotated with target_clones and/or
target_versions to be common if their specified versions don't overlap.
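
The "don't overlap" rule can be pictured with a rough sketch.  It
assumes each declaration is reduced to a list of version strings and
uses a hypothetical helper, versions_overlap; the actual hook works on
the declarations' attributes, not on plain strings.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: return true if any version string appears in both
// lists.  Under the rule described above, two declarations could only be
// treated as common versions when this returns false.
inline bool
versions_overlap (const std::vector<std::string> &a,
		  const std::vector<std::string> &b)
{
  return std::any_of (a.begin (), a.end (), [&] (const std::string &v) {
    return std::find (b.begin (), b.end (), v) != b.end ();
  });
}
```

For instance, a target_clones list of "default" and "sve" does not
overlap with a target_version of "sve2", but does overlap with another
declaration that also names "sve".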

This takes advantage of refactoring done in previous patches changing
how target_clones are expanded.

This can be enabled for riscv by modifying
riscv_common_function_versions. Currently, target_clones and target_version
declarations are always considered duplicates there, which prevents
mixing.

gcc/ChangeLog:

* attribs.cc (is_function_default_version): Add logic for target_clones
defining the default version.
* config/aarch64/aarch64.cc (aarch64_common_function_versions): Add
support for target_version and target_clone mixing.
* multiple_target.cc (expand_target_clones): Add support for the
default version not being part of a target_clones attribute.

gcc/c-family/ChangeLog:

* c-attribs.cc: Add support for target_version and target_clone mixing.

gcc/testsuite/ChangeLog:

* g++.target/aarch64/mv-and-mvc1.C: New test.
* g++.target/aarch64/mv-and-mvc2.C: New test.
* g++.target/aarch64/mv-and-mvc3.C: New test.
* g++.target/aarch64/mv-and-mvc4.C: New test.
---
 gcc/attribs.cc| 10 -
 gcc/c-family/c-attribs.cc |  9 +---
 gcc/config/aarch64/aarch64.cc | 45 ++-
 gcc/multiple_target.cc|  6 ++-
 .../g++.target/aarch64/mv-and-mvc1.C  | 38 
 .../g++.target/aarch64/mv-and-mvc2.C  | 29 
 .../g++.target/aarch64/mv-and-mvc3.C  | 41 +
 .../g++.target/aarch64/mv-and-mvc4.C  | 38 
 8 files changed, 204 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-and-mvc1.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-and-mvc2.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-and-mvc3.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-and-mvc4.C

diff --git a/gcc/attribs.cc b/gcc/attribs.cc
index d0f37d77098..2c402fb994d 100644
--- a/gcc/attribs.cc
+++ b/gcc/attribs.cc
@@ -1317,7 +1317,8 @@ make_dispatcher_decl (const tree decl)
With the target attribute semantics, returns true if the function is marked
as default with the target version.
With the target_version attribute semantics, returns true if the function
-   is either not annotated, or annotated as default.  */
+   is either not annotated, annotated as default, or is a target_clone
+   containing the default declaration.  */
 
 bool
 is_function_default_version (const tree decl)
@@ -1334,6 +1335,13 @@ is_function_default_version (const tree decl)
 }
   else
 {
+  if (lookup_attribute ("target_clones", DECL_ATTRIBUTES (decl)))
+	{
+	  int num_defaults = 0;
+	  get_clone_versions (decl, &num_defaults);
+	  return num_defaults > 0;
+	}
+
   attr = lookup_attribute ("target_version", DECL_ATTRIBUTES (decl));
   if (!attr)
 	return true;
diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index 642d724f6c6..f793418d3f4 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -249,13 +249,6 @@ static const struct attribute_spec::exclusions attr_target_clones_exclusions[] =
   ATTR_EXCL ("always_inline", true, true, true),
   ATTR_EXCL ("target", TARGET_HAS_FMV_TARGET_ATTRIBUTE,
 	 TARGET_HAS_FMV_TARGET_ATTRIBUTE, TARGET_HAS_FMV_TARGET_ATTRIBUTE),
-  ATTR_EXCL ("target_version", true, true, true),
-  ATTR_EXCL (NULL, false, false, false),
-};
-
-static const struct attribute_spec::exclusions attr_target_version_exclusions[] =
-{
-  ATTR_EXCL ("target_clones", true, true, true),
   ATTR_EXCL (NULL, false, false, false),
 };
 
@@ -543,7 +536,7 @@ const struct attribute_spec c_common_gnu_attributes[] =
 			  attr_target_exclusions },
   { "target_version", 1, 1, true, false, false, false,
 			  handle_target_version_attribute,
-			  attr_target_version_exclusions },
+			  NULL },
   { "target_clones",  1, -1, true, false, false, false,
 			  handle_target_clones_attribute,
 			  attr_target_clones_exclusions },
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 6b2247be7e7..cbba250da59 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -20682,7 +20682,50 @@ aarch64_common_function_versions (tree fn1, tree fn2)
   || TREE_CODE (fn2) != FUNCTION_DECL)
 return false;
 
-  return (aarch64_compare_version_priority (fn1, fn2) != 0);
+  /* As this is symmetric, can remove the case where fn2 is target clone and
+   

Re: [PATCH] COBOL 7/15 492K par: parser

2025-02-17 Thread James K. Lowden
On Sat, 15 Feb 2025 23:35:16 -0500
David Malcolm  wrote:

On better messages ...

> +  if( ($$ & $2) == $2 ) {
> +error_msg(@2, "%s clause repeated", clause);
> +YYERROR;
> +  }
> 
> Obviously not needed for initial release, but it would be neat to have
> a fix-it hint here that deletes the repeated token (fixit hints are
> done via class rich_location, FWIW)

Noted, thanks, 

> +  if( $data_clause == redefines_clause_e ) {
> +error_msg(@2, "REDEFINES must appear "
> + "immediately after LEVEL and NAME");
> +YYERROR;
> +  }
> 
> A strict reading of our diagnostic guidelines suggests that all of
> these keywords in these messages should be in quotes, via %{ and %}, or
> via %qs.  But given that cobol has UPPERCASE KEYWORDS THAT ALREADY
> REALLY STAND OUT, maybe that's overkill.

I endeavored to report every keyword in uppercase.  The user isn't required to 
use uppercase; COBOL is (despite the official name, heh) case-insensitive.  But 
the fact of the keyword is all we have.  The lexer doesn't capture how the user 
typed it in; it reports only the presence of the token.  

In the case of user-defined names, the actual name supplied is captured and 
reported literally.  So, the user could have e.g. 

001-Initialization Section.

where "Section" is a token whose input string is discarded, and 
"001-Initialization" is a user-defined name whose supplied form is preserved, 
and reported verbatim.  

With that in mind, I propose a policy that builds on your observation (for 
gcc-16, not today):  Report token names in uppercase, unquoted, and 
user-defined names verbatim, quoted.  
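
A minimal sketch of that policy, using two made-up helpers
(keyword_for_message and user_name_for_message are not functions in the
front end; a real diagnostic would more likely pass the name through
%qs than add quotes by hand):

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: render a keyword token in uppercase, unquoted,
// regardless of how the user typed it.
inline std::string
keyword_for_message (std::string token)
{
  for (char &c : token)
    c = static_cast<char> (std::toupper (static_cast<unsigned char> (c)));
  return token;
}

// Hypothetical helper: render a user-defined name verbatim, in quotes.
inline std::string
user_name_for_message (const std::string &name)
{
  return "'" + name + "'";
}
```

So "Section" would be reported as SECTION, while 001-Initialization
would be reported exactly as written, in quotes.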

I have that filed under "tasks for an eager volunteer, but probably me".  More 
tedious than difficult.  

> +  error_msg(@2, "%s is binary NUMERIC type, "
> +   "incompatible with SIGN IS", field->name);
> 
> Again, this isn't needed for the initial release, but GCC's diagnostics
> can have "rules" associated with them, which can have URLs (see
> diagnostic-metadata.h)  Is there a useful public standard for Cobol
> with such rules that the output can link to?

There is no freely available COBOL standard.  IBM and Microfocus (and others) 
do publish their documentation on the  web, but the official standard is 
copyrighted and comes at a price.  Please write to your congressman.  

That said, it's been my ambition to tie every relevant message to the ISO 
standard in force at time of compilation.  To that end, I want to move all 
messages to a table keyed by ISO version and section number (or other, for 
-dialect option).  The caller would refer to the table by the key, and 
error_msg() et al. would report that information along with the message text.  
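
Roughly what I have in mind, as a sketch only.  The names rule_key,
rule_table and message_with_rule are invented here, and the year and
section number in the table are placeholders, not real references into
the standard:

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical key: (ISO standard year, section number).
using rule_key = std::pair<int, std::string>;

// Hypothetical table of message texts.  The entry below uses a
// placeholder section number ("0.0"), not a real one.
inline const std::map<rule_key, std::string> &
rule_table ()
{
  static const std::map<rule_key, std::string> table = {
    { {2023, "0.0"}, "binary NUMERIC type is incompatible with SIGN IS" },
  };
  return table;
}

// Look up KEY and return the message text followed by its source.
inline std::string
message_with_rule (const rule_key &key)
{
  auto it = rule_table ().find (key);
  if (it == rule_table ().end ())
    return "unknown rule";
  return it->second + " [ISO " + std::to_string (key.first)
	 + " section " + key.second + "]";
}
```

The caller would pass only the key; the text and the reference to the
standard would then stay together in one place.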

I don't know of another compiler that does that.  I don't mind showing them how 
it's done!  

> +auto name = nice_name_of($inspected->field);
> +if( !name[0] ) name = "its argument";
> +error_msg(@inspected, "INSPECT cannot write to %s", 
> name);
> 
> Building up messages in fragments is a problem for i18n.  Better to
> have an if/else guarding two separate calls to error_msg.  Use %qs in
> the one that uses name, so the name appears in quotes.

Understood, thanks. 
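
For the record, the shape I take that to mean, sketched with a stand-in
that returns the text instead of calling error_msg().  The function
name inspect_write_error is made up, and plain quotes stand in for %qs:

```cpp
#include <cstdio>
#include <string>

// Stand-in for two separate error_msg() calls: each branch uses one
// complete format string, so a translator sees whole sentences rather
// than fragments glued together at run time.
inline std::string
inspect_write_error (const char *name)
{
  char buf[256];
  if (name && name[0])
    std::snprintf (buf, sizeof buf, "INSPECT cannot write to '%s'", name);
  else
    std::snprintf (buf, sizeof buf, "INSPECT cannot write to its argument");
  return buf;
}
```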

> +  if( $a.on_error && $a.not_error ) {
> +error_msg(@b, "too many ON EXCEPTION clauses");
> 
> Another thing not needed for the initial release - in general, if we're
> complaining about something in the code being incompatible with other
> code we already saw, it's good to issue a "note" immediately after the
> "error", 

Agreed, thanks.  

> Not needed for the initial release, but I see a lot of naked "new",
> assigned to $$.  Could this be a std::unique_ptr, and use make_unique
> rather than new? Similarly for the return type of functions like
> new_reference and new_literal.  I suspect this would be a lot of work,
> though, and may run into snags (does yylval have to be a POD?), so no
> obligation.

You may have already seen my lengthy defense of not worrying about every little 
string.  I would much rather waste 100 KB on unrecovered parser bits and bobs 
than spend one afternoon tracking down a double-free, or subjecting one user to 
a gcc_internal_error.  I would argue it's the only responsible choice, from a 
pareto optimal point of view.  

I hope that's persuasive.  If we're still in doubt, then we'll have to discuss 
what to do.  For sure, it would be a lot of work.  

yylval does have to be a POD.  If there's a constructor, IIRC I get an error 
message to the effect that the constructor was deleted.  But yylval can be a 
pointer to anything, because pointers can be copied.  
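
A quick probe of that distinction, as a sketch with made-up union types
(these are not the parser's actual %union, just the smallest thing that
shows the difference):

```cpp
#include <memory>
#include <string>
#include <type_traits>

// Hypothetical semantic-value type holding a raw pointer.
union value_with_raw_pointer
{
  int number;
  std::string *text;	// a plain pointer is trivially copyable
};

// Hypothetical semantic-value type holding a smart pointer directly.
union value_with_unique_ptr
{
  int number;
  std::unique_ptr<std::string> text;	// member with non-trivial special members
};

// The raw-pointer union can be copied bit for bit.
static_assert (std::is_trivially_copyable_v<value_with_raw_pointer>,
	       "raw pointers keep the union trivially copyable");

// The unique_ptr union has deleted default and copy constructors.
static_assert (!std::is_default_constructible_v<value_with_unique_ptr>,
	       "default constructor is deleted");
static_assert (!std::is_copy_constructible_v<value_with_unique_ptr>,
	       "copy constructor is deleted");
```

The last two assertions match the deleted-constructor error I remember
seeing.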

Whether or not std::unique_ptr fits under that aegis or not, I don't know.  It 
would take some doing to convince me to 

[PATCH v2] [testsuite] add x86 effective target

2025-02-17 Thread Alexandre Oliva
On Feb 13, 2025, Alexandre Oliva  wrote:

> @@ -14108,10 +14113,9 @@ proc dg-require-python-h { args } {
>  # Return 1 if the target supports heap-trampoline, 0 otherwise.
>  proc check_effective_target_heap_trampoline {} {
>  if { [istarget aarch64*-*-linux*]
> -  || [istarget i?86-*-darwin*]
> -  || [istarget x86_64-*-darwin*]
> -  || [istarget i?86-*-linux*]
> -  || [istarget x86_64-*-linux*] } {
> +  || { [check_effective_target_x86]
> +   && { [istarget *-*-darwin*]
> +|| [istarget *-*-linux*] } } } {
>   return 1
>  }
>  return 0

I used the wrong kind of brackets here, and missed the error that it
caused.  Here's a corrected patch, retested on x86_64-linux-gnu.
Ok to install?


I got tired of repeating the conditional that recognizes ia32 or
x86_64, and introduced 'x86' as a shorthand for that, adjusting all
occurrences in target-supports.exp, to set an example.  I found some
patterns that recognized i?86* and x86_64*, but I took those as likely
cut&pastos instead of trying to preserve those weirdnesses.


for  gcc/ChangeLog

* doc/sourcebuild.texi: Add x86 effective target.

for  gcc/testsuite/ChangeLog

* lib/target-supports.exp (check_effective_target_x86): New.
Replace all uses of i?86-*-* and x86_64-*-* in this file.
---
 gcc/doc/sourcebuild.texi  |3 +
 gcc/testsuite/lib/target-supports.exp |  188 +
 2 files changed, 99 insertions(+), 92 deletions(-)

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index 28338324f0724..d44c2e8cbe6a1 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -2798,6 +2798,9 @@ Target supports the execution of @code{user_msr} 
instructions.
 @item vect_cmdline_needed
 Target requires a command line argument to enable a SIMD instruction set.
 
+@item x86
+Target is ia32 or x86_64.
+
 @item xorsign
 Target supports the xorsign optab expansion.
 
diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index 9b5fbe5275613..fbeb2ad3dafa3 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -740,7 +740,7 @@ proc check_profiling_available { test_what } {
 }
 
 if { $test_what == "-fauto-profile" } {
-   if { !([istarget i?86-*-linux*] || [istarget x86_64-*-linux*]) } {
+   if { !([check_effective_target_x86] && [istarget *-*-linux*]) } {
verbose "autofdo only supported on linux"
return 0
}
@@ -2616,17 +2616,23 @@ proc remove_options_for_riscv_zvbb { flags } {
 return [add_options_for_riscv_z_ext zvbb $flags]
 }
 
+# Return 1 if the target is ia32 or x86_64.
+
+proc check_effective_target_x86 { } {
+if { ([istarget x86_64-*-*] || [istarget i?86-*-*]) } {
+   return 1
+} else {
+return 0
+}
+}
+
 # Return 1 if the target OS supports running SSE executables, 0
 # otherwise.  Cache the result.
 
 proc check_sse_os_support_available { } {
 return [check_cached_effective_target sse_os_support_available {
# If this is not the right target then we can skip the test.
-   if { !([istarget i?86-*-*] || [istarget x86_64-*-*]) } {
-   expr 0
-   } else {
-   expr 1
-   }
+   expr [check_effective_target_x86]
 }]
 }
 
@@ -2636,7 +2642,7 @@ proc check_sse_os_support_available { } {
 proc check_avx_os_support_available { } {
 return [check_cached_effective_target avx_os_support_available {
# If this is not the right target then we can skip the test.
-   if { !([istarget i?86-*-*] || [istarget x86_64-*-*]) } {
+   if { !([check_effective_target_x86]) } {
expr 0
} else {
# Check that OS has AVX and SSE saving enabled.
@@ -2659,7 +2665,7 @@ proc check_avx_os_support_available { } {
 proc check_avx512_os_support_available { } {
 return [check_cached_effective_target avx512_os_support_available {
# If this is not the right target then we can skip the test.
-   if { !([istarget i?86-*-*] || [istarget x86_64-*-*]) } {
+   if { !([check_effective_target_x86]) } {
expr 0
} else {
# Check that OS has AVX512, AVX and SSE saving enabled.
@@ -2682,7 +2688,7 @@ proc check_avx512_os_support_available { } {
 proc check_sse_hw_available { } {
 return [check_cached_effective_target sse_hw_available {
# If this is not the right target then we can skip the test.
-   if { !([istarget i?86-*-*] || [istarget x86_64-*-*]) } {
+   if { !([check_effective_target_x86]) } {
expr 0
} else {
check_runtime_nocache sse_hw_available {
@@ -2706,7 +2712,7 @@ proc check_sse_hw_available { } {
 proc check_sse2_hw_available { } {
 return [check_cached_effective_target sse2_hw_available {
# If this is not the right target then we can skip the test.
-   if { !([istarget i?86-*-*] || [istarget x86

[PATCH 2/2] libstdc++: Some concat_view bugfixes [PR115215, PR115218, LWG 4082]

2025-02-17 Thread Patrick Palka
Tested on x86_64-pc-linux-gnu, does this look OK for trunk?

-- >8 --

- Use __builtin_unreachable to suppress a false-positive "control
  reaches end of non-void function" warning in the recursive lambda
  (which the existing tests failed to notice since test01 wasn't
  being called at runtime)
- Relax the constraints on views::concat in the single-argument case
  as per PR115215
- Add an input_range requirement to that same case as per LWG 4082
- In the const-converting constructor of concat_view's iterator,
  don't require the first iterator to be default constructible
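
For anyone unfamiliar with the pattern in the first point, here is a
small standalone sketch of it.  It is not the libstdc++ code: the
function nth_as_long is made up, and it uses a function template rather
than a recursive lambda.  Without the final __builtin_unreachable (a
GCC/Clang builtin), the last instantiation has no return statement on
its fall-through path, which is what triggers the warning.

```cpp
#include <cstddef>
#include <tuple>

// Map a runtime INDEX onto a compile-time tuple index and return that
// element converted to long.  INDEX must be less than the tuple size.
template<std::size_t Idx = 0, typename Tuple>
long
nth_as_long (const Tuple &t, std::size_t index)
{
  if (Idx == index)
    return static_cast<long> (std::get<Idx> (t));
  if constexpr (Idx + 1 < std::tuple_size_v<Tuple>)
    return nth_as_long<Idx + 1> (t, index);
  // Only reachable if INDEX is out of range, which the caller rules out.
  __builtin_unreachable ();
}
```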

PR libstdc++/115215
PR libstdc++/115218

libstdc++-v3/ChangeLog:

* include/std/ranges
(concat_view::iterator::_S_invoke_with_runtime_index): Use
__builtin_unreachable in recursive lambda to certify it always
exits via 'return'.
(concat_view::iterator::iterator): In the const-converting
constructor, direct initialize _M_it.
(views::_Concat::operator()): Adjust constraints in the
single-argument case as per LWG 4082.
* testsuite/std/ranges/concat/1.cc (test01): Call it at runtime
too.
(test04): New test.
---
 libstdc++-v3/include/std/ranges   | 28 +--
 libstdc++-v3/testsuite/std/ranges/concat/1.cc | 16 +++
 2 files changed, 30 insertions(+), 14 deletions(-)

diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
index 22e0c9cae44..a56dae43625 100644
--- a/libstdc++-v3/include/std/ranges
+++ b/libstdc++-v3/include/std/ranges
@@ -9919,6 +9919,7 @@ namespace ranges
return __f.template operator()<_Idx>();
  if constexpr (_Idx + 1 < sizeof...(_Vs))
return __self.template operator()<_Idx + 1>();
+ __builtin_unreachable();
}.template operator()<0>();
   }
 
@@ -9940,12 +9941,12 @@ namespace ranges
 constexpr
 iterator(iterator __it)
   requires _Const && (convertible_to, iterator_t> && ...)
-: _M_parent(__it._M_parent)
-{
-  _M_invoke_with_runtime_index([this, &__it]() {
-   _M_it.template emplace<_Idx>(std::get<_Idx>(std::move(__it._M_it)));
-  });
-}
+: _M_parent(__it._M_parent),
+  _M_it(_S_invoke_with_runtime_index([this, &__it]() {
+ return __base_iter(in_place_index<_Idx>,
+std::get<_Idx>(std::move(__it._M_it)));
+   }, __it._M_it.index()))
+{ }
 
 constexpr decltype(auto)
 operator*() const
@@ -10179,16 +10180,15 @@ namespace ranges
 
 struct _Concat
 {
-  template
-   requires __detail::__can_concat_view<_Ts...>
+  template<__detail::__can_concat_view... _Ts>
   constexpr auto
   operator() [[nodiscard]] (_Ts&&... __ts) const
-  {
-   if constexpr (sizeof...(_Ts) == 1)
- return views::all(std::forward<_Ts>(__ts)...);
-   else
- return concat_view(std::forward<_Ts>(__ts)...);
-  }
+  { return concat_view(std::forward<_Ts>(__ts)...); }
+
+  template
+  constexpr auto
+  operator() [[nodiscard]] (_Range&& __t) const
+  { return views::all(std::forward<_Range>(__t)); }
 };
 
 inline constexpr _Concat concat;
diff --git a/libstdc++-v3/testsuite/std/ranges/concat/1.cc 
b/libstdc++-v3/testsuite/std/ranges/concat/1.cc
index e5d10f476e9..16721912a37 100644
--- a/libstdc++-v3/testsuite/std/ranges/concat/1.cc
+++ b/libstdc++-v3/testsuite/std/ranges/concat/1.cc
@@ -85,10 +85,26 @@ test03()
   VERIFY( ranges::equal(view2, std::vector{4, 5, 6, 1, 2, 3}) );
 }
 
+void
+test04()
+{
+  // PR libstdc++/115215 - views::concat rejects non-movable reference
+  int x[] = {1,2,3};
+  struct nomove {
+nomove() = default;
+nomove(const nomove&) = delete;
+  };
+  auto v = x | views::transform([](int) { return nomove{}; });
+  using type = decltype(views::concat(v));
+  using type = decltype(v);
+}
+
 int
 main()
 {
   static_assert(test01());
+  test01();
   test02();
   test03();
+  test04();
 }
-- 
2.48.1.356.g0394451348.dirty



Re: [PATCH] RISC-V: Bugfix ICE for RVV intrinisc when using no-extension parameters

2025-02-17 Thread Jeff Law




On 2/15/25 10:40 PM, Jin Ma wrote:

On Sun, 16 Feb 2025 11:59:37 +0800, Jeff Law wrote:



On 2/14/25 12:12 AM, Jin Ma wrote:

When using riscv_v_abi, the return and arguments of the function should
be adequately checked to avoid ICE.

PR target/118872

gcc/ChangeLog:

	* config/riscv/riscv.cc (riscv_fntype_abi): Strengthen the 
	of the check to avoid missing the error report.


gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr118872.c: New test.

Note this is causing regressions in the pre-commit CI system:



https://github.com/ewlu/gcc-precommit-ci/issues/3096#issuecomment-2659969415



Can you please take care of those regressions.  Thanks.


I sincerely apologize for the issues caused. I will work on resolving them. 
Before
submitting, I only performed regression tests locally for rv64gc-lp64d and 
rv32gc-ilp32d,
and it seems that was insufficient.

No worries.  It happens to all of us at some point.

For a RISC-V specific patch what I would recommend testing locally would 
be a configuration of interest to you/Alibaba.  It sounds like you're 
already doing that, which is fantastic.


So what I would do in the future is just wait for the pre-commit CI 
system to run before actually pushing a commit to the trunk, even if the 
patch has been approved.



You can monitor status of any given patch in the CI system you have 
submitted via this URL:




https://patchwork.sourceware.org/project/gcc/list/?series=&submitter=Jin+Ma&state=*&q=&archive=&delegate=


You can further refine the search if you're so inclined.





Additionally, I've noticed that when I run regression tests on the master 
branch using
rv64gcv-lp64d or rv32gcv-ilp32d without applying any patches, there are still 
over 300
failures related to RVV. I'm unsure if there is an issue with my testing 
approach.
That sounds about right (sadly).  What's more important than the number 
of failures is whether or not the patch has *new* failures.  ie, test 
before your patch, apply the patch, test after your patch and compare 
the results.  There are scripts in the contrib subdirectory that will 
compare the summary files.
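
The idea behind that comparison is just a set difference.  As a sketch
(the function new_failures and the test names used with it are
hypothetical; the contrib scripts work on the .sum files directly):

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Return the tests that fail AFTER the patch but did not fail BEFORE it.
// Pre-existing failures are ignored, however many there are.
inline std::vector<std::string>
new_failures (const std::set<std::string> &before,
	      const std::set<std::string> &after)
{
  std::vector<std::string> result;
  std::set_difference (after.begin (), after.end (),
		       before.begin (), before.end (),
		       std::back_inserter (result));
  return result;
}
```

If that list is empty, the 300+ existing RVV failures don't matter for
your patch.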




Jeff


Re: [PATCH] RISC-V: Fix failed tests for regression due to fix ICE patch

2025-02-17 Thread Jeff Law




On 2/16/25 7:51 PM, Jin Ma wrote:

Ref:
https://github.com/ewlu/gcc-precommit-ci/issues/3096#issue-2854419069

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/bug-9.c: Added new failure check.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-17.c: 
Likewise.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-18.c: 
Likewise.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-19.c: 
Likewise.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-20.c: 
Likewise.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-21.c: 
Likewise.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-22.c: 
Likewise.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-23.c: 
Likewise.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-24.c: 
Likewise.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-25.c: 
Likewise.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-26.c: 
Likewise.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-27.c: 
Likewise.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-28.c: 
Likewise.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-29.c: 
Likewise.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-3.c: 
Likewise.

OK.  And just to confirm, the pre-commit CI is happy with this patch ;-)

jeff



Re: [PATCH v1] RISC-V: Fix ICE for target attributes has different xlen size

2025-02-17 Thread Jeff Law




On 2/14/25 11:33 PM, pan2...@intel.com wrote:

From: Pan Li 

This patch avoids the ICE that occurs when the target attribute
specifies an xlen different from the command line, e.g. compiling with
rv64gc but with a target attribute of rv32gcv_zbb.  For example:

long foo (long a, long b)
__attribute__((target("arch=rv32gcv_zbb")));

long foo (long a, long b)
{
  return a + (b * 2);
}

When compiled with rv64gc -O3, it ICEs similarly to the below:

during RTL pass: fwprop1
test.c: In function ‘foo’:
test.c:10:1: internal compiler error: in add_use, at
rtl-ssa/accesses.cc:1234
10 | }
   | ^
0x44d6b9d internal_error(char const*, ...)
 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/diagnostic-global-context.cc:517
0x44a26a6 fancy_abort(char const*, int, char const*)
 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/diagnostic.cc:1722
0x408fac9 rtl_ssa::function_info::add_use(rtl_ssa::use_info*)
 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/rtl-ssa/accesses.cc:1234
0x40a5eea
rtl_ssa::function_info::create_reg_use(rtl_ssa::function_info::build_info&,
rtl_ssa::insn_info*, rtl_ssa::resource_info)
 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/rtl-ssa/insns.cc:496
0x4456738
rtl_ssa::function_info::add_artificial_accesses(rtl_ssa::function_info::build_info&,
df_ref_flags)
 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/rtl-ssa/blocks.cc:900
0x4457297
rtl_ssa::function_info::start_block(rtl_ssa::function_info::build_info&,
rtl_ssa::bb_info*)
 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/rtl-ssa/blocks.cc:1082
0x4453627
rtl_ssa::function_info::bb_walker::before_dom_children(basic_block_def*)
 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/rtl-ssa/blocks.cc:118
0x3e9f3fb dom_walker::walk(basic_block_def*)
 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/domwalk.cc:311
0x445806f rtl_ssa::function_info::process_all_blocks()
 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/rtl-ssa/blocks.cc:1298
0x40a22d3 rtl_ssa::function_info::function_info(function*)
 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/rtl-ssa/functions.cc:51
0x3ec3f80 fwprop_init
 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/fwprop.cc:893
0x3ec420d fwprop
 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/fwprop.cc:963
0x3ec43ad execute

Considering that we are in stage 4, we just report an error for the
above scenario when we detect that the cmd xlen differs from the target
attribute, in the implementation of the target hook
TARGET_OPTION_VALID_ATTRIBUTE_P.

PR target/118540

gcc/ChangeLog:

* config/riscv/riscv-target-attr.cc 
(riscv_target_attr_parser::parse_arch):
Report error when cmd xlen is different with target attribute.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr118540-1.c: New test.
* gcc.target/riscv/rvv/base/pr118540-2.c: New test.

Just whitespace nits caught by the linter.



  {
+  if (TARGET_64BIT && strncmp ("32", str + 2, strlen("32")) == 0)

Space between the strlen and the open paren.



+  if (!TARGET_64BIT && strncmp ("64", str + 2, strlen("64")) == 0)

Similarly.

I think the other linter warning is inside the diagnostic string and a 
false positive.


So OK with the two whitespace fixes.
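
To make the two whitespace nits concrete, here is a minimal standalone sketch of the same kind of xlen comparison written with GNU spacing (a space between the function name and the open paren). The helper name `arch_xlen_mismatch_p` and its `target_64bit_p` parameter are hypothetical stand-ins for illustration; this is not the code from the patch.

```c
#include <string.h>

/* Hypothetical helper, not the patch's code: return 1 when ARCH (for
   example "rv32gcv_zbb") names an xlen that disagrees with the
   command-line xlen, and 0 otherwise.  TARGET_64BIT_P stands in for
   GCC's TARGET_64BIT.  */
static int
arch_xlen_mismatch_p (int target_64bit_p, const char *arch)
{
  /* Only strings starting with "rv" carry an xlen at offset 2.  */
  if (strncmp ("rv", arch, strlen ("rv")) != 0)
    return 0;

  /* Note the space in "strlen (" as requested in the review.  */
  if (target_64bit_p && strncmp ("32", arch + 2, strlen ("32")) == 0)
    return 1;

  if (!target_64bit_p && strncmp ("64", arch + 2, strlen ("64")) == 0)
    return 1;

  return 0;
}
```

With a 64-bit command line, `arch_xlen_mismatch_p (1, "rv32gcv_zbb")` reports a mismatch, which corresponds to the case where the patch reports an error instead of hitting the ICE.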

jeff



Re: [PATCH] RISC-V: Bugfix ICE for RVV intrinsic when using no-extension parameters

2025-02-17 Thread Jin Ma
On Mon, 17 Feb 2025 22:49:47 -0700, Jeff Law wrote:
> 
> 
> On 2/15/25 10:40 PM, Jin Ma wrote:
> > On Sun, 16 Feb 2025 11:59:37 +0800, Jeff Law wrote:
> >>
> >>
> >> On 2/14/25 12:12 AM, Jin Ma wrote:
> >>> When using riscv_v_abi, the return and arguments of the function should
> >>> be adequately checked to avoid ICE.
> >>>
> >>>   PR target/118872
> >>>
> >>> gcc/ChangeLog:
> >>>
> >>>   * config/riscv/riscv.cc (riscv_fntype_abi): Strengthen the 
> >>>   of the check to avoid missing the error report.
> >>>
> >>> gcc/testsuite/ChangeLog:
> >>>
> >>>   * gcc.target/riscv/rvv/base/pr118872.c: New test.
> >> Note this is causing regressions in the pre-commit CI system:
> >>
> >>
> >>> https://github.com/ewlu/gcc-precommit-ci/issues/3096#issuecomment-2659969415
> >>
> >>
> >> Can you please take care of those regressions.  Thanks.
> > 
> > I sincerely apologize for the issues caused. I will work on resolving them.
> > Before submitting, I only performed regression tests locally for
> > rv64gc-lp64d and rv32gc-ilp32d, and it seems that was insufficient.
> No worries.  It happens to all of us at some point.
> 
> For a RISC-V specific patch what I would recommend testing locally would 
> be a configuration of interest to you/Alibaba.  It sounds like you're 
> already doing that, which is fantastic.
> 
> So what I would do in the future is just wait for the pre-commit CI 
> system to run before actually pushing a commit to the trunk, even if the 
> patch has been approved.
> 
> 
> You can monitor status of any given patch in the CI system you have 
> submitted via this URL:
> 
> 
> > https://patchwork.sourceware.org/project/gcc/list/?series=&submitter=Jin+Ma&state=*&q=&archive=&delegate=
> 
> You can further refine the search if you're so inclined.

Thank you very much for your response and guidance; it has been very helpful
to me.

> 
> > 
> > Additionally, I've noticed that when I run regression tests on the master
> > branch using rv64gcv-lp64d or rv32gcv-ilp32d without applying any patches,
> > there are still over 300 failures related to RVV. I'm unsure if there is an
> > issue with my testing approach.
> That sounds about right (sadly).  What's more important than the number 
> of failures is whether or not the patch has *new* failures.  ie, test 
> before your patch, apply the patch, test after your patch and compare 
> the results.  There are scripts in the contrib subdirectory that will 
> compare the summary files.
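
The before/after comparison described above can be sketched as a small set difference over failing test names. This is only an illustration of the idea; the function names are hypothetical, and the scripts in contrib work on the real summary files and do considerably more.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if NAME occurs in LIST (N entries), else 0.  */
static int
contains (const char *const *list, size_t n, const char *name)
{
  for (size_t i = 0; i < n; i++)
    if (strcmp (list[i], name) == 0)
      return 1;
  return 0;
}

/* Count the tests that fail in AFTER but did not fail in BEFORE.
   Failures present in both lists are pre-existing and are ignored.  */
static size_t
count_new_failures (const char *const *before, size_t n_before,
                    const char *const *after, size_t n_after)
{
  size_t count = 0;
  for (size_t i = 0; i < n_after; i++)
    if (!contains (before, n_before, after[i]))
      count++;
  return count;
}
```

A patch is clean in this sense when `count_new_failures` returns 0, even if both runs contain hundreds of pre-existing failures.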

Yes, I see. Thanks again :) 

Best regards,
Jin Ma

> 
> Jeff



RE: [PATCH v1] RISC-V: Fix ICE for target attributes has different xlen size

2025-02-17 Thread Li, Pan2
> So OK with the two whitespace fixes.

Thanks Jeff, will commit with the whitespace fixes.

Pan

-----Original Message-----
From: Jeff Law  
Sent: Tuesday, February 18, 2025 2:00 PM
To: Li, Pan2 ; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; rdapp@gmail.com
Subject: Re: [PATCH v1] RISC-V: Fix ICE for target attributes has different 
xlen size



On 2/14/25 11:33 PM, pan2...@intel.com wrote:
> From: Pan Li 
> 
> This patch avoids the ICE that occurs when the target attribute
> specifies an xlen different from the one given on the command line,
> i.e. compiling with rv64gc while the target attribute is rv32gcv_zbb.
> For example:
> 
> long foo (long a, long b)
> __attribute__((target("arch=rv32gcv_zbb")));
> 
> long foo (long a, long b)
> {
>   return a + (b * 2);
> }
> 
> When compiled with rv64gc -O3, it produces an ICE similar to the one below:
> 
> during RTL pass: fwprop1
> test.c: In function ‘foo’:
> test.c:10:1: internal compiler error: in add_use, at
> rtl-ssa/accesses.cc:1234
> 10 | }
>| ^
> 0x44d6b9d internal_error(char const*, ...)
>  
> /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/diagnostic-global-context.cc:517
> 0x44a26a6 fancy_abort(char const*, int, char const*)
>  
> /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/diagnostic.cc:1722
> 0x408fac9 rtl_ssa::function_info::add_use(rtl_ssa::use_info*)
>  
> /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/rtl-ssa/accesses.cc:1234
> 0x40a5eea
> rtl_ssa::function_info::create_reg_use(rtl_ssa::function_info::build_info&,
> rtl_ssa::insn_info*, rtl_ssa::resource_info)
>  
> /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/rtl-ssa/insns.cc:496
> 0x4456738
> rtl_ssa::function_info::add_artificial_accesses(rtl_ssa::function_info::build_info&,
> df_ref_flags)
>  
> /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/rtl-ssa/blocks.cc:900
> 0x4457297
> rtl_ssa::function_info::start_block(rtl_ssa::function_info::build_info&,
> rtl_ssa::bb_info*)
>  
> /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/rtl-ssa/blocks.cc:1082
> 0x4453627
> rtl_ssa::function_info::bb_walker::before_dom_children(basic_block_def*)
>  
> /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/rtl-ssa/blocks.cc:118
> 0x3e9f3fb dom_walker::walk(basic_block_def*)
>  
> /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/domwalk.cc:311
> 0x445806f rtl_ssa::function_info::process_all_blocks()
>  
> /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/rtl-ssa/blocks.cc:1298
> 0x40a22d3 rtl_ssa::function_info::function_info(function*)
>  
> /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/rtl-ssa/functions.cc:51
> 0x3ec3f80 fwprop_init
>  
> /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/fwprop.cc:893
> 0x3ec420d fwprop
>  
> /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/fwprop.cc:963
> 0x3ec43ad execute
> 
> Since we are in stage 4, we simply report an error for the above scenario
> when the cmd xlen is detected to differ from the target attribute in the
> TARGET_OPTION_VALID_ATTRIBUTE_P target hook implementation.
> 
>   PR target/118540
> 
> gcc/ChangeLog:
> 
>   * config/riscv/riscv-target-attr.cc 
> (riscv_target_attr_parser::parse_arch):
>   Report error when cmd xlen is different with target attribute.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/riscv/rvv/base/pr118540-1.c: New test.
>   * gcc.target/riscv/rvv/base/pr118540-2.c: New test.
Just whitespace nits caught by the linter.


>   {
> +  if (TARGET_64BIT && strncmp ("32", str + 2, strlen("32")) == 0)
Space between the strlen and the open paren.


> +  if (!TARGET_64BIT && strncmp ("64", str + 2, strlen("64")) == 0)
Similarly.

I think the other linter warning is inside the diagnostic string and a 
false positive.

So OK with the two whitespace fixes.

jeff



[PATCH] i386: Re-order i386.opt.urls

2025-02-17 Thread Haochen Jiang
(Seems patch not sent out, resending)

Hi all,

The order of the entries in i386.opt.urls needs to match i386.opt according
to the auto builder. I thought the urls file was a dict, but it actually is not.
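
As a rough illustration of the ordering requirement, the sketch below checks that the option names listed in a urls file appear in the same relative order as in the corresponding .opt file. It assumes the requirement is only about relative order; the function name is hypothetical, and the real auto builder regenerates the file and may check more than this.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if every name in URLS appears in OPT and the names occur in
   the same relative order, i.e. URLS is an ordered subsequence of OPT.
   Return 0 otherwise.  */
static int
same_relative_order (const char *const *opt, size_t n_opt,
                     const char *const *urls, size_t n_urls)
{
  size_t j = 0;
  for (size_t i = 0; i < n_opt && j < n_urls; i++)
    if (strcmp (opt[i], urls[j]) == 0)
      j++;
  return j == n_urls;
}
```

If i386.opt lists mavx10.2 before mavx10.2-512, as the ChangeLog below implies, the old urls ordering fails this check and the reordered one passes.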

Commit as obvious.

Thx,
Haochen

gcc/ChangeLog:

* config/i386/i386.opt.urls: Adjust the order for avx10.2
and avx10.2-512 due to their order change in i386.opt.
---
 gcc/config/i386/i386.opt.urls | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/config/i386/i386.opt.urls b/gcc/config/i386/i386.opt.urls
index 5cb304d2a73..ee6806169df 100644
--- a/gcc/config/i386/i386.opt.urls
+++ b/gcc/config/i386/i386.opt.urls
@@ -605,12 +605,12 @@ UrlSuffix(gcc/x86-Options.html#index-mavx10_002e1-512)
 mavx10.2-256
 UrlSuffix(gcc/x86-Options.html#index-mavx10_002e2-256)
 
-mavx10.2-512
-UrlSuffix(gcc/x86-Options.html#index-mavx10_002e2-512)
-
 mavx10.2
 UrlSuffix(gcc/x86-Options.html#index-mavx10_002e2)
 
+mavx10.2-512
+UrlSuffix(gcc/x86-Options.html#index-mavx10_002e2-512)
+
 mamx-avx512
 UrlSuffix(gcc/x86-Options.html#index-mamx-avx512)
 
-- 
2.31.1



Re: [PATCH] tree-optimization/98845 - ICE with tail-merging and DCE/DSE disabled

2025-02-17 Thread Richard Biener
On Mon, 17 Feb 2025, Richard Biener wrote:

> The following shows that tail-merging will make dead SSA defs live
> in paths where it wasn't before, possibly introducing UB or as
> in this case, uses of abnormals that eventually fail coalescing
> later.  The fix is to register such defs for stmt comparison.
> 
> Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Pushed with the following additional testcase adjustment.

diff --git a/gcc/testsuite/gcc.dg/pr81192.c b/gcc/testsuite/gcc.dg/pr81192.c
index c46ac18fd9a..87a7a7a19c8 100644
--- a/gcc/testsuite/gcc.dg/pr81192.c
+++ b/gcc/testsuite/gcc.dg/pr81192.c
@@ -25,12 +25,16 @@ void __GIMPLE(ssa, startwith("pre")) fn2   ()
   if (j_6(D) != _Literal (int)2147483647)
 goto __BB4;
   else
-goto __BB5;
+goto __BB9;
 
   __BB(4):
   iftmp2_8 = j_6(D) + _Literal (int)1;
   goto __BB5;
 
+  __BB(9):
+  iftmp2_8 = j_6(D) + _Literal (int)1;
+  goto __BB5;
+
   __BB(5):
   b_lsm6_10 = _Literal (int)2147483647;
   goto __BB6;


>   PR tree-optimization/98845
>   * tree-ssa-tail-merge.cc (stmt_local_def): Consider a
>   def with no uses not local.
> 
>   * gcc.dg/pr98845.c: New testcase.
> ---
>  gcc/testsuite/gcc.dg/pr98845.c | 33 +
>  gcc/tree-ssa-tail-merge.cc |  8 
>  2 files changed, 41 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/pr98845.c
> 
> diff --git a/gcc/testsuite/gcc.dg/pr98845.c b/gcc/testsuite/gcc.dg/pr98845.c
> new file mode 100644
> index 000..074c979678f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr98845.c
> @@ -0,0 +1,33 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fno-tree-dce -fno-tree-dse" } */
> +
> +int n;
> +
> +__attribute__ ((returns_twice)) void
> +foo (void);
> +
> +void
> +bar (void);
> +
> +void
> +quux (int x)
> +{
> +  if (x)
> +++x;
> +  else
> +{
> +  if (n)
> +{
> +  x = 1;
> +  foo ();
> +}
> +  else
> +bar ();
> +
> +  if (n)
> +{
> +  ++x;
> +  ++n;
> +}
> +}
> +}
> diff --git a/gcc/tree-ssa-tail-merge.cc b/gcc/tree-ssa-tail-merge.cc
> index d897970079c..857e91c206b 100644
> --- a/gcc/tree-ssa-tail-merge.cc
> +++ b/gcc/tree-ssa-tail-merge.cc
> @@ -336,10 +336,13 @@ stmt_local_def (gimple *stmt)
>  
>def_bb = gimple_bb (stmt);
>  
> +  bool any_use = false;
>FOR_EACH_IMM_USE_FAST (use_p, iter, val)
>  {
>if (is_gimple_debug (USE_STMT (use_p)))
>   continue;
> +
> +  any_use = true;
>bb = gimple_bb (USE_STMT (use_p));
>if (bb == def_bb)
>   continue;
> @@ -351,6 +354,11 @@ stmt_local_def (gimple *stmt)
>return false;
>  }
>  
> +  /* When there is no use avoid making the stmt live on other paths.
> + This can happen with DCE disabled or not done as seen in PR98845.  */
> +  if (!any_use)
> +return false;
> +
>return true;
>  }
>  
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


Re: [PATCH 1/1] AArch64: Fold builtins with highpart args to highpart equivalent [PR117850]

2025-02-17 Thread Kyrylo Tkachov
Hi Spencer,

> On 17 Feb 2025, at 20:07, Spencer Abson  wrote:
> 
> Add a fold at gimple_fold_builtin to prefer the highpart variant of a builtin
> if the arguments are better suited to it. This helps us avoid copying data
> between lanes before operation.
> 
> E.g. We prefer to use UMULL2 rather than DUP+UMULL for the following:
> 
> uint16x8_t
> foo(const uint8x16_t s) {
> const uint8x16_t f0 = vdupq_n_u8(4);
> return vmull_u8(vget_high_u8(s), vget_high_u8(f0));
> }
> 

Yeah, this looks like a great approach. It doesn’t conflict with my previous
work in the area as I was just removing uses of UNSPEC in the backend and
replacing them with organic RTL expressions that happened to allow for more
transformations by the RTL passes. Doing this folding in GIMPLE is a good
orthogonal improvement.



> gcc/ChangeLog:
> 
> * config/aarch64/aarch64-builtins.cc (LO_HI_PAIRINGS): New macro.
> Covers every LO_HI_PAIR.
> (aarch64_get_highpart_builtin): New function. Get the highpart builtin
> paired with the input FCODE.
> (LO_HI_PAIR):
> (aarch64_self_concat_vec_cst): New function. Concatenate a
> VECTOR_CST with itself.
> (aarch64_object_of_bfr): New function. Helper to check arguments
> for vector highparts.
> (aarch64_fold_lo_call_to_hi): New function.
> (aarch64_general_gimple_fold_builtin): Add cases for the lowpart
> builtins.
> * config/aarch64/aarch64-builtin-pairs.def: New file. Declare
> pairings of lowpart/highpart builtins.
> 
> gcc/testsuite/ChangeLog:
> * gcc.target/aarch64/simd/vabal_combine.c: Test changed to
> pass after earlier builtin fold.
> * gcc.target/aarch64/simd/fold_to_highpart_1.c: New test.
> * gcc.target/aarch64/simd/fold_to_highpart_2.c: New test.
> * gcc.target/aarch64/simd/fold_to_highpart_3.c: New test.
> * gcc.target/aarch64/simd/fold_to_highpart_4.c: New test.
> * gcc.target/aarch64/simd/fold_to_highpart_5.c: New test.
> ---
> gcc/config/aarch64/aarch64-builtin-pairs.def  |  77 ++
> gcc/config/aarch64/aarch64-builtins.cc| 232 ++
> .../aarch64/simd/fold_to_highpart_1.c | 708 ++
> .../aarch64/simd/fold_to_highpart_2.c |  82 ++
> .../aarch64/simd/fold_to_highpart_3.c |  80 ++
> .../aarch64/simd/fold_to_highpart_4.c |  77 ++
> .../aarch64/simd/fold_to_highpart_5.c |  71 ++
> .../gcc.target/aarch64/simd/vabal_combine.c   |  12 +-
> 8 files changed, 1333 insertions(+), 6 deletions(-)
> create mode 100644 gcc/config/aarch64/aarch64-builtin-pairs.def
> create mode 100644 gcc/testsuite/gcc.target/aarch64/simd/fold_to_highpart_1.c
> create mode 100644 gcc/testsuite/gcc.target/aarch64/simd/fold_to_highpart_2.c
> create mode 100644 gcc/testsuite/gcc.target/aarch64/simd/fold_to_highpart_3.c
> create mode 100644 gcc/testsuite/gcc.target/aarch64/simd/fold_to_highpart_4.c
> create mode 100644 gcc/testsuite/gcc.target/aarch64/simd/fold_to_highpart_5.c
> 
> diff --git a/gcc/config/aarch64/aarch64-builtin-pairs.def 
> b/gcc/config/aarch64/aarch64-builtin-pairs.def
> new file mode 100644
> index 000..d3ca69a1887
> --- /dev/null
> +++ b/gcc/config/aarch64/aarch64-builtin-pairs.def
> @@ -0,0 +1,77 @@
> +/* Pairings of AArch64 builtins that can be folded into each other.
> +   Copyright (C) 2025 Free Software Foundation, Inc.
> +
> +   This file is part of GCC.
> +
> +   GCC is free software; you can redistribute it and/or modify it
> +   under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3, or (at your option)
> +   any later version.
> +
> +   GCC is distributed in the hope that it will be useful, but
> +   WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with GCC; see the file COPYING3.  If not see
> +   .  */
> +
> +/* LO/HI widenable integer modes.  */
> +#define LO_HI_PAIR_V_WI(T, LO, HI) \
> +  LO_HI_PAIR (T##_##LO##v2si, T##_##HI##v4si) \
> +  LO_HI_PAIR (T##_##LO##v4hi, T##_##HI##v8hi) \
> +  LO_HI_PAIR (T##_##LO##v8qi, T##_##HI##v16qi)
> +
> +/* LO/HI Single/Half integer modes.  */
> +#define LO_HI_PAIR_V_HSI(T, LO, HI) \
> +  LO_HI_PAIR (T##_##LO##v2si, T##_##HI##v4si) \
> +  LO_HI_PAIR (T##_##LO##v4hi, T##_##HI##v8hi)
> +
> +#define UNOP_LONG_LH_PAIRS \
> +  LO_HI_PAIR (UNOP_sxtlv8hi,  UNOP_vec_unpacks_hi_v16qi) \
> +  LO_HI_PAIR (UNOP_sxtlv4si,  UNOP_vec_unpacks_hi_v8hi) \
> +  LO_HI_PAIR (UNOP_sxtlv2di,  UNOP_vec_unpacks_hi_v4si) \
> +  LO_HI_PAIR (UNOPU_uxtlv8hi, UNOPU_vec_unpacku_hi_v16qi) \
> +  LO_HI_PAIR (UNOPU_uxtlv4si, UNOPU_vec_unpacku_hi_v8hi) \
> +  LO_HI_PAIR (UNOPU_uxtlv2di, UNOPU_vec_unpacku_hi_v4si)
> +
> +#define BINOP_LONG_LH_PAIRS \
> +  LO_HI_PAIR_V_WI (BINOP,  saddl, saddl2) \
> +  LO_HI_PAIR_V_WI (BINOPU, uaddl, uaddl2) \
> +  LO_HI_PAIR_V_WI (BINOP,  ssubl, ssubl2) \
> +  LO_HI_PAIR_V_WI

[PATCH 1/1] AArch64: Fold builtins with highpart args to highpart equivalent [PR117850]

2025-02-17 Thread Spencer Abson
Add a fold at gimple_fold_builtin to prefer the highpart variant of a builtin
if the arguments are better suited to it. This helps us avoid copying data
between lanes before operation.

E.g. We prefer to use UMULL2 rather than DUP+UMULL for the following:

uint16x8_t
foo(const uint8x16_t s) {
const uint8x16_t f0 = vdupq_n_u8(4);
return vmull_u8(vget_high_u8(s), vget_high_u8(f0));
}
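
For readers without an AArch64 toolchain at hand, the following scalar model sketches why the two forms are equivalent. It models the vectors as plain arrays; the function names are hypothetical and this is not how the fold is implemented. The first function copies the high halves out and then does a widening multiply on them (the DUP+UMULL shape), the second multiplies the high lanes in place (the UMULL2 shape).

```c
#include <stdint.h>
#include <string.h>

/* Model of the "extract high halves, then widening multiply" form:
   the high 8 lanes of A and B are copied out first, then multiplied.  */
static void
mull_via_high_copy (const uint8_t a[16], const uint8_t b[16],
                    uint16_t out[8])
{
  uint8_t a_hi[8], b_hi[8];
  memcpy (a_hi, a + 8, sizeof a_hi);
  memcpy (b_hi, b + 8, sizeof b_hi);
  for (int i = 0; i < 8; i++)
    out[i] = (uint16_t) (a_hi[i] * b_hi[i]);
}

/* Model of the highpart form: multiply lanes 8..15 directly, with no
   intermediate copy.  255 * 255 fits in 16 bits, so nothing is lost.  */
static void
mull_highpart (const uint8_t a[16], const uint8_t b[16], uint16_t out[8])
{
  for (int i = 0; i < 8; i++)
    out[i] = (uint16_t) (a[8 + i] * b[8 + i]);
}
```

Both functions produce identical results for every input, which is the property the fold relies on; the highpart form simply skips the lane copy.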

gcc/ChangeLog:

* config/aarch64/aarch64-builtins.cc (LO_HI_PAIRINGS): New macro.
Covers every LO_HI_PAIR.
(aarch64_get_highpart_builtin): New function. Get the highpart builtin
paired with the input FCODE.
(LO_HI_PAIR):
(aarch64_self_concat_vec_cst): New function. Concatenate a
VECTOR_CST with itself.
(aarch64_object_of_bfr): New function. Helper to check arguments
for vector highparts.
(aarch64_fold_lo_call_to_hi): New function.
(aarch64_general_gimple_fold_builtin): Add cases for the lowpart
builtins.
* config/aarch64/aarch64-builtin-pairs.def: New file. Declare
pairings of lowpart/highpart builtins.

gcc/testsuite/ChangeLog:
* gcc.target/aarch64/simd/vabal_combine.c: Test changed to
pass after earlier builtin fold.
* gcc.target/aarch64/simd/fold_to_highpart_1.c: New test.
* gcc.target/aarch64/simd/fold_to_highpart_2.c: New test.
* gcc.target/aarch64/simd/fold_to_highpart_3.c: New test.
* gcc.target/aarch64/simd/fold_to_highpart_4.c: New test.
* gcc.target/aarch64/simd/fold_to_highpart_5.c: New test.
---
 gcc/config/aarch64/aarch64-builtin-pairs.def  |  77 ++
 gcc/config/aarch64/aarch64-builtins.cc| 232 ++
 .../aarch64/simd/fold_to_highpart_1.c | 708 ++
 .../aarch64/simd/fold_to_highpart_2.c |  82 ++
 .../aarch64/simd/fold_to_highpart_3.c |  80 ++
 .../aarch64/simd/fold_to_highpart_4.c |  77 ++
 .../aarch64/simd/fold_to_highpart_5.c |  71 ++
 .../gcc.target/aarch64/simd/vabal_combine.c   |  12 +-
 8 files changed, 1333 insertions(+), 6 deletions(-)
 create mode 100644 gcc/config/aarch64/aarch64-builtin-pairs.def
 create mode 100644 gcc/testsuite/gcc.target/aarch64/simd/fold_to_highpart_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/simd/fold_to_highpart_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/simd/fold_to_highpart_3.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/simd/fold_to_highpart_4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/simd/fold_to_highpart_5.c

diff --git a/gcc/config/aarch64/aarch64-builtin-pairs.def 
b/gcc/config/aarch64/aarch64-builtin-pairs.def
new file mode 100644
index 000..d3ca69a1887
--- /dev/null
+++ b/gcc/config/aarch64/aarch64-builtin-pairs.def
@@ -0,0 +1,77 @@
+/* Pairings of AArch64 builtins that can be folded into each other.
+   Copyright (C) 2025 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful, but
+   WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   .  */
+
+/* LO/HI widenable integer modes.  */
+#define LO_HI_PAIR_V_WI(T, LO, HI) \
+  LO_HI_PAIR (T##_##LO##v2si, T##_##HI##v4si) \
+  LO_HI_PAIR (T##_##LO##v4hi, T##_##HI##v8hi) \
+  LO_HI_PAIR (T##_##LO##v8qi, T##_##HI##v16qi)
+
+/* LO/HI Single/Half integer modes.  */
+#define LO_HI_PAIR_V_HSI(T, LO, HI) \
+  LO_HI_PAIR (T##_##LO##v2si, T##_##HI##v4si) \
+  LO_HI_PAIR (T##_##LO##v4hi, T##_##HI##v8hi)
+
+#define UNOP_LONG_LH_PAIRS \
+  LO_HI_PAIR (UNOP_sxtlv8hi,  UNOP_vec_unpacks_hi_v16qi) \
+  LO_HI_PAIR (UNOP_sxtlv4si,  UNOP_vec_unpacks_hi_v8hi) \
+  LO_HI_PAIR (UNOP_sxtlv2di,  UNOP_vec_unpacks_hi_v4si) \
+  LO_HI_PAIR (UNOPU_uxtlv8hi, UNOPU_vec_unpacku_hi_v16qi) \
+  LO_HI_PAIR (UNOPU_uxtlv4si, UNOPU_vec_unpacku_hi_v8hi) \
+  LO_HI_PAIR (UNOPU_uxtlv2di, UNOPU_vec_unpacku_hi_v4si)
+
+#define BINOP_LONG_LH_PAIRS \
+  LO_HI_PAIR_V_WI (BINOP,  saddl, saddl2) \
+  LO_HI_PAIR_V_WI (BINOPU, uaddl, uaddl2) \
+  LO_HI_PAIR_V_WI (BINOP,  ssubl, ssubl2) \
+  LO_HI_PAIR_V_WI (BINOPU, usubl, usubl2) \
+  LO_HI_PAIR_V_WI (BINOP,  sabdl, sabdl2) \
+  LO_HI_PAIR_V_WI (BINOPU, uabdl, uabdl2) \
+  LO_HI_PAIR_V_WI (BINOP,  intrinsic_vec_smult_lo_, vec_widen_smult_hi_) \
+  LO_HI_PAIR_V_WI (BINOPU, intrinsic_vec_umult_lo_, vec_widen_umult_hi_) \
+  LO_HI_PAIR_V_HSI (BINOP,  sqdmull, sqdmull2)
+
+#define BINOP_LONG_N_LH_PAIRS \
+  LO_HI_PAIR_V_HSI (BINOP,  smull_n, smull_hi_n) \
+  

Re: [PATCH] COBOL 6/15 156K lex: lexer

2025-02-17 Thread David Malcolm
On Mon, 2025-02-17 at 12:29 -0500, James K. Lowden wrote:
> On Sat, 15 Feb 2025 23:32:37 -0500
> David Malcolm  wrote:
> 
> In defense of lack of free(3) ...
> 
> > > +const char *
> > > +esc( size_t len, const char input[] ) {
> > > +  static char spaces[] = "([,;]?[[:space:]])+";
> > > +  static char spaceD[] = "(\n {6}D" "|" "[,;]?[[:space:]])+";
> > > +  static char buffer[64 * 1024];
> > > +  char *p = buffer;
> > > +  const char *eoinput = input + len;
> > > +
> > > +  const char *spacex = is_reference_format()? spaceD : spaces;
> > > +
> > > +  for( const char *s=input; *s && s < eoinput; s++ ) {
> > > +    *p = '\0';
> > > +    gcc_assert( size_t(p - buffer) < sizeof(buffer) - 4 );
> 
> overflow guarded here

...but only via gcc_assert.

gcc/system.h has:

/* Use gcc_assert(EXPR) to test invariants.  */
#if ENABLE_ASSERT_CHECKING
#define gcc_assert(EXPR)\
   ((void)(!(EXPR) ? fancy_abort (__FILE__, __LINE__, __FUNCTION__), 0 : 0))
#elif (GCC_VERSION >= 4005)
#define gcc_assert(EXPR)\
  ((void)(UNLIKELY (!(EXPR)) ? __builtin_unreachable (), 0 : 0))
#else
/* Include EXPR, so that unused variable warnings do not occur.  */
#define gcc_assert(EXPR) ((void)(0 && (EXPR)))
#endif

so if ENABLE_ASSERT_CHECKING is false, that check gets preprocessed
away.

It turns out (see gcc/configure.ac) that with --enable-checking=release
(which is soon to become the default for gcc 15) ENABLE_ASSERT_CHECKING
is still true, but it is possible to configure gcc with it false (e.g.
with --enable-checking=none).

We shouldn't rely on assert to do checking of user-controllable input;
it should always be checked.
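
A minimal sketch of an always-on bounds check, for comparison with the gcc_assert approach: the helper below refuses to write past the end of a fixed-size buffer regardless of how the compiler was configured. The name `append_checked` and the return-code convention are assumptions made for illustration; in the front end the failure would presumably be turned into a diagnostic instead.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append the SRC_LEN bytes at SRC to the buffer
   [BUF, BUF + CAP) at offset *POS, always leaving room for a trailing
   NUL.  Returns 0 on success and -1 if the data does not fit, in which
   case the buffer and *POS are left unchanged.  The check is ordinary
   code, so it cannot be configured away like an assertion.  */
static int
append_checked (char *buf, size_t cap, size_t *pos,
                const char *src, size_t src_len)
{
  if (*pos >= cap || src_len > cap - *pos - 1)
    return -1;
  memcpy (buf + *pos, src, src_len);
  *pos += src_len;
  buf[*pos] = '\0';
  return 0;
}
```

The caller decides what "die at 64KB" means, for example by reporting an error on the first -1, but the overflow itself can no longer happen in any build configuration.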

> 
> > > +    switch(*s) {
> > > +    case '^': case '$':
> > > +    case '(': case ')':
> > > +    case '*': case '+': case '?':
> > > +    case '[': case ']':
> > > +    case '{': case '}':
> > > +    case '|':
> > > +    case '.':
> > > +  *p++ = '\\';
> > > +  *p++ = *s;
> > > +  break;
> > > +    case '\\':
> > > +  *p++ = '[';
> > > +  *p++ = *s;
> > > +  *p++ = ']';
> > > +  break;
> > > +
> > > +    case ';': case ',':
> > > +  if( ! (s+1 < eoinput && s[1] == 0x20) ) {
> > > +    *p++ = *s;
> > > +    break;
> > > +  }
> > > +  __attribute__((fallthrough));
> > > +    case 0x20: case '\n':
> > > +  gcc_assert(p + sizeof(spacex) < buffer + sizeof(buffer));
> 
> and overflow guarded here, the only place where more than 4
> characters can be inserted into the buffer.  

Likewise.
> 
> > > +  p = stpcpy( p, spacex );
> > > +  while( s+1 < eoinput && is_separator_space(s+1)) {
> > > +    s++;
> > > +  }
> > > +  break;
> > > +    default:
> > > +  *p++ = *s;
> > > +  break;
> > > +    }
> > > +  }
> > > +  *p = '\0';
> ...
> > > +  return xstrdup(buffer);
> > > +}
> > 
> > Has a fixed size 64k buffer; doesn't seem to have proper overflow
> > handling. Could use a pretty_printer to accumulate chars.
> 
> Thank you for these comments.  Let me see if I can alleviate your
> concerns.  
> 
> This function is called from exactly one place, where the file-
> reader, lexio, parses a REPLACE directive.  The COBOL input says
> "REPLACE X BY Y" subject to some constraints.  Because we're using
> regex to find X, and because X might be any arbitrary string, the
> esc() function escapes regex metacharacters prior to executing the
> regex.  
> 
> IMHO it's unlikely the resulting regex input will exceed 64 KB.  It's
> unlikely to be even 64 bytes.  (Usually X is a COBOL identifier,
> limited to 64 bytes by ISO.)  Granted, a fixed maximum is a
> limitation.  But I put it to you: which is more likely?  For a regex
> to exceed 64 KB, or for heap allocation to fail to return adequate
> memory?  If there's some crazy input, I'd rather die at 64 KB than
> consume gigabytes of swap on the way to crashing.  

If you're going to have it die at 64KB, please make sure it *does* die
(hence my concerns about using gcc_assert for the bounds checking).

Hypothetical scenario: sysadmin sets up a service that accepts
arbitrary COBOL code to be compiled (such as Compiler Explorer).  Can
an attacker construct a not-necessarily valid input COBOL program that
overruns the fixed-size static buffer?  If so, they can probably use this to
inject code into the process, at which point they've got a beachhead to
attack the system further.  (Though IIRC Compiler Explorer already
sandboxes the compilers that are running there)

[...]

Dave



[PATCH] builtins: Ensure sin and cos properly set errno when INFINITY is passed [PR80042]

2025-02-17 Thread Peter Damianov
POSIX says that sin and cos should set errno to EDOM when infinity is passed to
them. Make sure this is accounted for in builtins.def, and add tests.

gcc/
PR middle-end/80042
* builtins.def: (sin|cos)(f|l) can set errno.
gcc/testsuite/
* gcc.dg/pr80042.c: New testcase.
---
 gcc/builtins.def   | 20 +-
 gcc/testsuite/gcc.dg/pr80042.c | 71 ++
 2 files changed, 82 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr80042.c

diff --git a/gcc/builtins.def b/gcc/builtins.def
index 89fc74654ca..86d416f95d4 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -288,6 +288,8 @@ along with GCC; see the file COPYING3.  If not see
maintenance purposes.  */
 #undef ATTR_MATHFN_FPROUNDING_STORE
 #define ATTR_MATHFN_FPROUNDING_STORE ATTR_NOTHROW_LEAF_LIST
+#undef ATTR_MATHFN_FPROUNDING_ERRNO_STORE
+#define ATTR_MATHFN_FPROUNDING_ERRNO_STORE ATTR_NOTHROW_LEAF_LIST
 
 /* Define an attribute list for leaf functions that do not throw
exceptions normally, but may throw exceptions when using
@@ -350,8 +352,8 @@ DEF_C99_BUILTIN(BUILT_IN_COPYSIGNL, "copysignl", 
BT_FN_LONGDOUBLE_LONGDO
 #define COPYSIGN_TYPE(F) BT_FN_##F##_##F##_##F
 DEF_EXT_LIB_FLOATN_NX_BUILTINS (BUILT_IN_COPYSIGN, "copysign", COPYSIGN_TYPE, 
ATTR_CONST_NOTHROW_LEAF_LIST)
 #undef COPYSIGN_TYPE
-DEF_LIB_BUILTIN(BUILT_IN_COS, "cos", BT_FN_DOUBLE_DOUBLE, 
ATTR_MATHFN_FPROUNDING)
-DEF_C99_C90RES_BUILTIN (BUILT_IN_COSF, "cosf", BT_FN_FLOAT_FLOAT, 
ATTR_MATHFN_FPROUNDING)
+DEF_LIB_BUILTIN(BUILT_IN_COS, "cos", BT_FN_DOUBLE_DOUBLE, 
ATTR_MATHFN_FPROUNDING_ERRNO)
+DEF_C99_C90RES_BUILTIN (BUILT_IN_COSF, "cosf", BT_FN_FLOAT_FLOAT, 
ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_LIB_BUILTIN(BUILT_IN_COSH, "cosh", BT_FN_DOUBLE_DOUBLE, 
ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_C99_C90RES_BUILTIN (BUILT_IN_COSHF, "coshf", BT_FN_FLOAT_FLOAT, 
ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_C99_C90RES_BUILTIN (BUILT_IN_COSHL, "coshl", BT_FN_LONGDOUBLE_LONGDOUBLE, 
ATTR_MATHFN_FPROUNDING_ERRNO)
@@ -674,18 +676,18 @@ DEF_EXT_LIB_BUILTIN(BUILT_IN_SIGNBITD128, 
"signbitd128", BT_FN_INT_DFLOAT128
 DEF_EXT_LIB_BUILTIN(BUILT_IN_SIGNIFICAND, "significand", 
BT_FN_DOUBLE_DOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_EXT_LIB_BUILTIN(BUILT_IN_SIGNIFICANDF, "significandf", 
BT_FN_FLOAT_FLOAT, ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_EXT_LIB_BUILTIN(BUILT_IN_SIGNIFICANDL, "significandl", 
BT_FN_LONGDOUBLE_LONGDOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO)
-DEF_LIB_BUILTIN(BUILT_IN_SIN, "sin", BT_FN_DOUBLE_DOUBLE, 
ATTR_MATHFN_FPROUNDING)
-DEF_EXT_LIB_BUILTIN(BUILT_IN_SINCOS, "sincos", 
BT_FN_VOID_DOUBLE_DOUBLEPTR_DOUBLEPTR, ATTR_MATHFN_FPROUNDING_STORE)
-DEF_EXT_LIB_BUILTIN(BUILT_IN_SINCOSF, "sincosf", 
BT_FN_VOID_FLOAT_FLOATPTR_FLOATPTR, ATTR_MATHFN_FPROUNDING_STORE)
-DEF_EXT_LIB_BUILTIN(BUILT_IN_SINCOSL, "sincosl", 
BT_FN_VOID_LONGDOUBLE_LONGDOUBLEPTR_LONGDOUBLEPTR, ATTR_MATHFN_FPROUNDING_STORE)
-DEF_C99_C90RES_BUILTIN (BUILT_IN_SINF, "sinf", BT_FN_FLOAT_FLOAT, 
ATTR_MATHFN_FPROUNDING)
+DEF_LIB_BUILTIN(BUILT_IN_SIN, "sin", BT_FN_DOUBLE_DOUBLE, 
ATTR_MATHFN_FPROUNDING_ERRNO)
+DEF_EXT_LIB_BUILTIN(BUILT_IN_SINCOS, "sincos", 
BT_FN_VOID_DOUBLE_DOUBLEPTR_DOUBLEPTR, ATTR_MATHFN_FPROUNDING_ERRNO_STORE)
+DEF_EXT_LIB_BUILTIN(BUILT_IN_SINCOSF, "sincosf", 
BT_FN_VOID_FLOAT_FLOATPTR_FLOATPTR, ATTR_MATHFN_FPROUNDING_ERRNO_STORE)
+DEF_EXT_LIB_BUILTIN(BUILT_IN_SINCOSL, "sincosl", 
BT_FN_VOID_LONGDOUBLE_LONGDOUBLEPTR_LONGDOUBLEPTR, 
ATTR_MATHFN_FPROUNDING_ERRNO_STORE)
+DEF_C99_C90RES_BUILTIN (BUILT_IN_SINF, "sinf", BT_FN_FLOAT_FLOAT, 
ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_LIB_BUILTIN(BUILT_IN_SINH, "sinh", BT_FN_DOUBLE_DOUBLE, 
ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_C99_C90RES_BUILTIN (BUILT_IN_SINHF, "sinhf", BT_FN_FLOAT_FLOAT, 
ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_C99_C90RES_BUILTIN (BUILT_IN_SINHL, "sinhl", BT_FN_LONGDOUBLE_LONGDOUBLE, 
ATTR_MATHFN_FPROUNDING_ERRNO)
 #define SINH_TYPE(F) BT_FN_##F##_##F
 DEF_EXT_LIB_FLOATN_NX_BUILTINS (BUILT_IN_SINH, "sinh", SINH_TYPE, 
ATTR_MATHFN_FPROUNDING_ERRNO)
-DEF_C99_C90RES_BUILTIN (BUILT_IN_SINL, "sinl", BT_FN_LONGDOUBLE_LONGDOUBLE, 
ATTR_MATHFN_FPROUNDING)
-DEF_EXT_LIB_FLOATN_NX_BUILTINS (BUILT_IN_SIN, "sin", SINH_TYPE, 
ATTR_MATHFN_FPROUNDING)
+DEF_C99_C90RES_BUILTIN (BUILT_IN_SINL, "sinl", BT_FN_LONGDOUBLE_LONGDOUBLE, 
ATTR_MATHFN_FPROUNDING_ERRNO)
+DEF_EXT_LIB_FLOATN_NX_BUILTINS (BUILT_IN_SIN, "sin", SINH_TYPE, 
ATTR_MATHFN_FPROUNDING_ERRNO)
 #undef SINH_TYPE
 DEF_LIB_BUILTIN(BUILT_IN_SQRT, "sqrt", BT_FN_DOUBLE_DOUBLE, 
ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_C99_C90RES_BUILTIN (BUILT_IN_SQRTF, "sqrtf", BT_FN_FLOAT_FLOAT, 
ATTR_MATHFN_FPROUNDING_ERRNO)
diff --git a/gcc/testsuite/gcc.dg/pr80042.c b/gcc/testsuite/gcc.dg/pr80042.c
new file mode 100644
index 000..cc578ae67e2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr80042.c
@@ -0,0 +1,71 @@
+/* dg-do run */
+/* dg-options "-O2 -lm" */
+
+#include 
+
+voi

Re: [PATCH] builtins: Ensure sin and cos properly set errno when INFINITY is passed [PR80042]

2025-02-17 Thread Sam James
Peter Damianov  writes:

> POSIX says that sin and cos should set errno to EDOM when infinity is passed 
> to
> them. Make sure this is accounted for in builtins.def, and add tests.
>
> gcc/
>   PR middle-end/80042
>   * builtins.def: (sin|cos)(f|l) can set errno.
> gcc/testsuite/
>   * gcc.dg/pr80042.c: New testcase.
> ---
>  gcc/builtins.def   | 20 +-
>  gcc/testsuite/gcc.dg/pr80042.c | 71 ++
>  2 files changed, 82 insertions(+), 9 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/pr80042.c
>
> [...]
> diff --git a/gcc/testsuite/gcc.dg/pr80042.c b/gcc/testsuite/gcc.dg/pr80042.c
> new file mode 100644
> index 000..cc578ae67e2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr80042.c
> @@ -0,0 +1,71 @@
> +/* dg-do run */
> +/* dg-options "-O2 -lm" */

These two lines are missing {}. Please double check the logs from your
testsuite run to make sure newly added/changed tests are executed (and
in the way you expect).
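
For reference, DejaGnu only recognizes directives wrapped in braces inside the comment, so the two header lines would be expected to look like the fragment below (same options as in the patch, only the braces added):

```c
/* { dg-do run } */
/* { dg-options "-O2 -lm" } */
```

Without the braces the lines are plain comments, so the test falls back to the default action and options for its directory rather than being run with the intended flags.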

> [...]


[PATCH 1/2] libstdc++: Sync concat_view with final paper revision [PR115209]

2025-02-17 Thread Patrick Palka
Tested on x86_64-pc-linux-gnu, does this look OK for trunk?

-- >8 --

The original implementation was accidentally based off of an older
revision of the paper, P2542R7 instead of R8.  As far as I can tell
the only semantic change in the final revision is the relaxed
constraints on the iterator's iter/sent operator- overloads.

The revision also simplifies the concat_view::end wording via C++26
pack indexing, which GCC 15 and Clang 19/20 implement so we can use
it unconditionally here and remove the __last_is_common helper trait.
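
As a rough illustration of the difference, the sketch below contrasts a recursive "last element" trait (the style of the removed helper) with direct indexing of the last type, plus a fold over all types but the first. It is plain C++17, so std::tuple_element_t stands in for the C++26 spelling Ts...[sizeof...(Ts) - 1]; every name in it is hypothetical and none of it is the libstdc++ code.

```cpp
#include <tuple>
#include <type_traits>

// Recursive helper in the style of the removed __last_is_common:
// peel types off the front until one is left, then apply Pred to it.
template<template<typename> class Pred, typename T, typename... Rest>
struct last_satisfies : last_satisfies<Pred, Rest...> { };

template<template<typename> class Pred, typename T>
struct last_satisfies<Pred, T> : Pred<T> { };

// Direct indexing of the last element; a C++17 stand-in for the C++26
// pack-indexing spelling Ts...[sizeof...(Ts) - 1].
template<typename... Ts>
using last_of_t = std::tuple_element_t<sizeof...(Ts) - 1, std::tuple<Ts...>>;

// Fold over everything except the first type, in the style of
// __all_but_first_sized.
template<template<typename> class Pred, typename First, typename... Rest>
struct all_but_first_satisfy
{ static constexpr bool value = (Pred<Rest>::value && ...); };
```

With direct indexing the predicate can be applied to last_of_t<Ts...> at the point of use, which is why the separate recursive trait becomes unnecessary.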

PR libstdc++/115209

libstdc++-v3/ChangeLog:

* include/std/ranges (__detail::__last_is_common): Remove.
(__detail::__all_but_first_sized): New.
(concat_view::end): Use C++26 pack indexing instead of
__last_is_common as per P2542R8.
(concat_view::iterator::operator-): Update constraints on
iter/sent overloads as per P2542R8.
---
 libstdc++-v3/include/std/ranges | 38 ++---
 1 file changed, 16 insertions(+), 22 deletions(-)

diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
index 5c795a90fbc..22e0c9cae44 100644
--- a/libstdc++-v3/include/std/ranges
+++ b/libstdc++-v3/include/std/ranges
@@ -9683,12 +9683,8 @@ namespace ranges
&& __all_but_last_common<_Const, _Rs...>::value;
 
 template
-  struct __last_is_common
-  { static inline constexpr bool value = __last_is_common<_Rs...>::value; 
};
-
-template
-  struct __last_is_common<_Range>
-  { static inline constexpr bool value = common_range<_Range>; };
+  struct __all_but_first_sized
+  { static inline constexpr bool value = (sized_range<_Rs> && ...); };
   } // namespace __detail
 
   template
@@ -9726,13 +9722,11 @@ namespace ranges
 constexpr auto
 end() requires (!(__detail::__simple_view<_Vs> && ...))
 {
+  constexpr auto __n = sizeof...(_Vs);
   if constexpr ((semiregular> && ...)
-   && __detail::__last_is_common<_Vs...>::value)
-   {
- constexpr auto __n = sizeof...(_Vs);
- return iterator(this, in_place_index<__n - 1>,
-ranges::end(std::get<__n - 1>(_M_views)));
-   }
+   && common_range<_Vs...[__n - 1]>)
+   return iterator(this, in_place_index<__n - 1>,
+  ranges::end(std::get<__n - 1>(_M_views)));
   else
return default_sentinel;
 }
@@ -9740,13 +9734,11 @@ namespace ranges
 constexpr auto
 end() const requires (range && ...) && 
__detail::__concatable
 {
+  constexpr auto __n = sizeof...(_Vs);
   if constexpr ((semiregular> && ...)
-   && __detail::__last_is_common::value)
-   {
- constexpr auto __n = sizeof...(_Vs);
- return iterator(this, in_place_index<__n - 1>,
-   ranges::end(std::get<__n - 1>(_M_views)));
-   }
+   && common_range)
+   return iterator(this, in_place_index<__n - 1>,
+ ranges::end(std::get<__n - 1>(_M_views)));
   else
return default_sentinel;
 }
@@ -10128,8 +10120,9 @@ namespace ranges
 
 friend constexpr difference_type
 operator-(const iterator& __x, default_sentinel_t)
-  requires __detail::__concat_is_random_access<_Const, _Vs...>
-   && __detail::__last_is_common<__maybe_const_t<_Const, _Vs>...>::value
+  requires (sized_sentinel_for>,
+  iterator_t<__maybe_const_t<_Const, _Vs>>> && 
...)
+   && __detail::__all_but_first_sized<__maybe_const_t<_Const, 
_Vs>...>::value
 {
   return _S_invoke_with_runtime_index([&]() -> difference_type 
{
auto __dx = ranges::distance(std::get<_Ix>(__x._M_it),
@@ -10148,8 +10141,9 @@ namespace ranges
 
 friend constexpr difference_type
 operator-(default_sentinel_t, const iterator& __x)
-  requires __detail::__concat_is_random_access<_Const, _Vs...>
-   && __detail::__last_is_common<__maybe_const_t<_Const, _Vs>...>::value
+  requires (sized_sentinel_for>,
+  iterator_t<__maybe_const_t<_Const, _Vs>>> && 
...)
+   && __detail::__all_but_first_sized<__maybe_const_t<_Const, 
_Vs>...>::value
 { return -(__x - default_sentinel); }
 
 friend constexpr decltype(auto)
-- 
2.48.1.356.g0394451348.dirty



Re: [PATCH] COBOL 7/15 492K par: parser

2025-02-17 Thread Eric Gallager
On Mon, Feb 17, 2025 at 1:43 PM James K. Lowden
 wrote:
>
> On Sat, 15 Feb 2025 23:35:16 -0500
> David Malcolm  wrote:
>
> On better messages ...
>
> > +  if( ($$ & $2) == $2 ) {
> > +error_msg(@2, "%s clause repeated", clause);
> > +YYERROR;
> > +  }
> >
> > Obviously not needed for initial release, but it would be neat to have
> > a fix-it hint here that deletes the repeated token (fixit hints are
> > done via class rich_location, FWIW)
>
> Noted, thanks,
>
> > +  if( $data_clause == redefines_clause_e ) {
> > +error_msg(@2, "REDEFINES must appear "
> > + "immediately after LEVEL and NAME");
> > +YYERROR;
> > +  }
> >
> > A strict reading of our diagnostic guidelines suggests that all of
> > these keywords in these messages should be in quotes, via %{ and %}, or
> > via %qs.  But given that cobol has UPPERCASE KEYWORDS THAT ALREADY
> > REALLY STAND OUT, maybe that's overkill.
>
> I endeavored to report every keyword in uppercase.  The user isn't required 
> to use uppercase; COBOL is (despite the official name, heh) case-insensitive. 
>  But the fact of the keyword is all we have.  The lexer doesn't capture how 
> the user typed it in; it reports only the presence of the token.
>
> In the case of user-defined names, the actual name supplied is captured and 
> reported literally.  So, the user could have e.g.
>
> 001-Initialization Section.
>
> where "Section" is a token whose input string is discarded, and
> "001-Initialization" is a user-defined name whose supplied form is preserved, 
> and reported verbatim.
>
> With that in mind, I propose a policy that builds on your observation (for 
> gcc-16, not today):  Report token names in uppercase, unquoted, and 
> user-defined names verbatim, quoted.
>
> I have that filed under "tasks for an eager volunteer, but probably me".  
> More tedious than difficult.
>
> > +  error_msg(@2, "%s is binary NUMERIC type, "
> > +   "incompatible with SIGN IS", field->name);
> >
> > Again, this isn't needed for the initial release, but GCC's diagnostics
> > can have "rules" associated with them, which can have URLs (see
> > diagnostic-metadata.h)  Is there a useful public standard for Cobol
> > with such rules that the output can link to?
>
> There is no freely available COBOL standard.  IBM and Microfocus (and others) 
> do publish their documentation on the web, but the official standard is 
> copyrighted and comes at a price.  Please write to your congressman.
>

So, this might not help at the federal level at all, but at least here
at the state level in New Hampshire I've been trying to update our
code regarding open standards to push for greater use of them; please
see the following diff and let me know what further changes I could
make that would help force the relevant standards to be opened:
https://github.com/cooljeanius/legislation/blob/master/tech/21-R-mrg.htm.diff

> That said, it's been my ambition to tie every relevant message to the ISO 
> standard in force at time of compilation.  To that end, I want to move all 
> messages to a table keyed by ISO version and section number (or other, for 
> -dialect option).  The caller would refer to the table by the key, and 
> error_msg() et al. would report that information along with the message text.
>
> I don't know of another compiler that does that.  I don't mind showing them 
> how it's done!
>
> > +auto name = nice_name_of($inspected->field);
> > +if( !name[0] ) name = "its argument";
> > +error_msg(@inspected, "INSPECT cannot write to %s", 
> > name);
> >
> > Building up messages in fragments is a problem for i18n.  Better to
> > have an if/else guarding two separate calls to error_msg.  Use %qs in
> > the one that uses name, so the name appears in quotes.
>
> Understood, thanks.
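
A minimal sketch of the if/else shape suggested above, assuming a hypothetical helper that only formats the text (the real error_msg() and the %qs directive are not reproduced; plain quotes stand in for %qs). The point is that each branch carries one complete, translatable sentence.

```cpp
#include <cstdio>
#include <string>

// Hypothetical stand-in for the front end's error_msg(): it only formats
// the text so that the two variants can be compared.
std::string inspect_error_text (const char *name)
{
  char buf[128];
  if (name && name[0])
    {
      // Complete sentence with the name as a quoted argument.
      std::snprintf (buf, sizeof buf, "INSPECT cannot write to '%s'", name);
    }
  else
    {
      // Separate, complete sentence for the unnamed case.
      std::snprintf (buf, sizeof buf, "INSPECT cannot write to its argument");
    }
  return buf;
}
```

Neither string is assembled from fragments, so a translator sees each sentence whole.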
>
> > +  if( $a.on_error && $a.not_error ) {
> > +error_msg(@b, "too many ON EXCEPTION clauses");
> >
> > Another thing not needed for the initial release - in general, if we're
> > complaining about something in the code being incompatible with other
> > code we already saw, it's good to issue a "note" immediately after the
> > "error",
>
> Agreed, thanks.
>
> > Not needed for the initial release, but I see a lot of naked "new",
> > assigned to $$.  Could this be a std::unique_ptr, and use make_unique
> > rather than new? Similarly for the return type of functions like
> > new_reference and new_literal.  I suspect this would be a lot of work,
> > though, and may run into snags (does yylval have to be a POD?), so no
> > obligation.
>
> You may have already seen my lengthy defense of not worrying about every 
> little string.  I would much rather waste 100 KB on unrecovered parser bits 
> and bobs than spend one afternoon 

[PATCH 0/1] AArch64: Fold builtin calls w/ highpart args to highpart equivalent [PR117850]

2025-02-17 Thread Spencer Abson
Hi all,

This patch implements the missed optimisation noted in PR117850.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117850

It covers all the AArch64 builtins that I can imagine this is sensible for,
excluding vshll/vshll_n (for now) due to a discrepancy in their declarations.

Bootstrapped and regtested on aarch64-none-linux-gnu. This work was also
tested on a cross-compiler targeting aarch64_be-none-linux-gnu.

CC'ing Kyrylo as it looks like this patch interferes with his earlier work -
I'm wondering what to do about simd/vabal_combine.c without losing coverage?

OK for stage-1?

Spencer

Spencer Abson (1):
  AArch64: Fold builtins with highpart args to highpart equivalent
[PR117850]

 gcc/config/aarch64/aarch64-builtin-pairs.def  |  77 ++
 gcc/config/aarch64/aarch64-builtins.cc| 232 ++
 .../aarch64/simd/fold_to_highpart_1.c | 708 ++
 .../aarch64/simd/fold_to_highpart_2.c |  82 ++
 .../aarch64/simd/fold_to_highpart_3.c |  80 ++
 .../aarch64/simd/fold_to_highpart_4.c |  77 ++
 .../aarch64/simd/fold_to_highpart_5.c |  71 ++
 .../gcc.target/aarch64/simd/vabal_combine.c   |  12 +-
 8 files changed, 1333 insertions(+), 6 deletions(-)
 create mode 100644 gcc/config/aarch64/aarch64-builtin-pairs.def
 create mode 100644 gcc/testsuite/gcc.target/aarch64/simd/fold_to_highpart_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/simd/fold_to_highpart_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/simd/fold_to_highpart_3.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/simd/fold_to_highpart_4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/simd/fold_to_highpart_5.c

-- 
2.34.1



Re: [PATCH] rx: allow cmpstrnsi len to be zero

2025-02-17 Thread Jeff Law




On 2/15/25 12:32 PM, Keith Packard wrote:



It's as reasonable as other methods such as turning it into a
define_expand and emitting a conditional branch around the sequence when
the count is zero.


Thanks much. I suspect the cost of the PSW setting instructions is far
less than a branch, so how about this version which emits them only when
the length is unknown and skips the instruction entirely when the length
is known to be zero?
I don't know anything about the micro-arch of the RX chip, so no clue 
which is likely faster.  Things like an explicit PSW write may be 
pipeline flushes on some designs -- which could be insanely painful.





This replaces the previous patch because it is easier to understand; if
this looks correct, I'll submit it as a patch on top of the previous
one. This also passes my picolibc CI tests.

diff --git a/gcc/config/rx/rx.md b/gcc/config/rx/rx.md
index 89211585c9c..d2424ca395e 100644
--- a/gcc/config/rx/rx.md
+++ b/gcc/config/rx/rx.md
@@ -2545,6 +2545,16 @@ (define_expand "cmpstrnsi"
 (match_operand:SI4 "immediate_operand")] ;; 
Known Align
"rx_allow_string_insns"
{
+bool const_len = CONST_INT_P(operands[3]);
+if (const_len)
+{
+  if (INTVAL(operands[3]) == 0)
Note that operand 3's predicate is "register_operand", so I would be 
surprised to see a CONST_INT node show up in that operand.   There are 
some cases where we bypass those checks, so I'm not saying it can't 
happen, just that I'd be surprised if it does.  Seems like it'd be worth 
double checking.


If you are indeed able to get a CONST_INT node, then something like your 
patch is basically sensible.  For comparisons against zero and a few 
other special nodes we use CONST0_RTX (mode).


operands[3] != CONST0_RTX (mode)

Looking at the expander it copies operands[3] into "len" where "len" is 
always an SImode register.  So the test should be:


operands[3] != CONST0_RTX (SImode);

It's a nit in this context but in others that form will work better.

But I'd really like to get a confirmation that we can get a CONST_INT 
node in here before installing.


If you can't trigger it, you could always change the predicate from 
"register_operand" to something that also allows constants.  Then you 
just need to be sure that you can directly copy an arbitrary constant 
into a physical register.  I don't know the RX well enough to know if 
that's possible or not.


Jeff


Re: Ping: [PATCH] late-combine: Tighten register class check [PR108840]

2025-02-17 Thread Jeff Law




On 2/17/25 4:33 AM, Richard Sandiford wrote:

Ping
Fell through the cracks.  We really should acknowledge that you know 
this code better than anyone and shouldn't need to wait on review.


Anyway, LGTM.

jeff



[committed] i386: Simplify PARALLEL RTX scan in ix86_find_all_reg_use

2025-02-17 Thread Uros Bizjak
UNSPEC and UNSPEC_VOLATILE never store. Remove unnecessary checks and
simplify RTX scan in ix86_find_all_reg_use to scan only for SET RTX
in the PARALLEL.

gcc/ChangeLog:

* config/i386/i386.cc (ix86_find_all_reg_use):
Scan only for SET RTX in PARALLEL.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Uros.
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index fafd4a511a3..560e6525b56 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -8538,31 +8538,9 @@ ix86_find_all_reg_use (HARD_REG_SET &stack_slot_access,
   for (int i = 0; i < XVECLEN (pat, 0); i++)
{
  rtx exp = XVECEXP (pat, 0, i);
- switch (GET_CODE (exp))
-   {
-   case ASM_OPERANDS:
-   case CLOBBER:
-   case PREFETCH:
-   case USE:
- break;
-   case UNSPEC:
-   case UNSPEC_VOLATILE:
- for (int j = XVECLEN (exp, 0) - 1; j >= 0; j--)
-   {
- rtx x = XVECEXP (exp, 0, j);
- if (GET_CODE (x) == SET)
-   ix86_find_all_reg_use_1 (x, stack_slot_access,
-worklist);
-   }
- break;
-   case SET:
- ix86_find_all_reg_use_1 (exp, stack_slot_access,
-  worklist);
- break;
-   default:
- gcc_unreachable ();
- break;
-   }
+
+ if (GET_CODE (exp) == SET)
+   ix86_find_all_reg_use_1 (exp, stack_slot_access, worklist);
}
 }
 }


Re: [PATCH][_Hashtable] Fix hash code cache usage

2025-02-17 Thread François Dumont

Ping for this bug fix, would you like a PR ?

On 20/01/2025 22:12, François Dumont wrote:

Hi

In my work on fancy pointer support I've decided to always cache the 
hash code.


Doing so I spotted a bug in the management of this cache when hash 
functor is stateful.


    libstdc++: [_Hashtable] Fix hash code cache usage when hash 
functor is stateful


    It is wrong to reuse a cached hash code when that code depends on the state
    of the Hash functor.

    Add checks that Hash functor is stateless before reusing the 
cached hash code.
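
To make the problem concrete, here is a small sketch with made-up names. It assumes that "stateless" can be approximated by std::is_empty; that is an assumption of this example, not a description of the check in the patch.

```cpp
#include <cstddef>
#include <type_traits>

// A hasher whose result depends on per-object state.
struct SeededHash
{
  std::size_t seed;
  std::size_t operator() (int key) const
  { return static_cast<std::size_t> (key) ^ seed; }
};

// A hasher with no state at all.
struct PlainHash
{
  std::size_t operator() (int key) const
  { return static_cast<std::size_t> (key); }
};

// Assumption for this sketch: "stateless" is approximated by std::is_empty.
template<typename Hash>
constexpr bool cached_code_reusable = std::is_empty<Hash>::value;
```

Two SeededHash objects with different seeds give different codes for the same key, so a code cached by one container is meaningless to another container holding a differently seeded hasher.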


    libstdc++-v3/ChangeLog:

    * include/bits/hashtable_policy.h 
(_Hash_code_base::_M_copy_code): Remove.

    * include/bits/hashtable.h (_M_copy_code): New.
    (_M_assign): Use latter.
    (_M_bucket_index_ex): New.
    (_M_equals): Use latter.
    * testsuite/23_containers/unordered_map/modifiers/merge.cc 
(test10): New

    test case.

Tested under Linux x64, ok to commit ?

François


Re: [PATCH] COBOL 6/15 156K lex: lexer

2025-02-17 Thread James K. Lowden
On Mon, 17 Feb 2025 15:35:16 -0500
David Malcolm  wrote:

> > > Have you tried running the compiler under valgrind?  Configure
> > > with --enable-valgrind-annotations and pass -wrapper=valgrind to
> > > the driver.

> a benefit of my suggested approach is that if you *do* need to use
> valgrind at some point, it doesn't get swamped by noise from the
> frontend

I will try it and see how far we get.  Like you, I don't want the COBOL
front end to inconvenience anyone, if possible.  

--jkl


[PATCH v2] builtins: Ensure sin and cos properly set errno when INFINITY is passed [PR80042]

2025-02-17 Thread Peter Damianov
POSIX says that sin and cos should set errno to EDOM when infinity is passed to
them. Make sure this is accounted for in builtins.def, and add tests.

gcc/
PR middle-end/80042
* builtins.def: (sin|cos)(f|l) can set errno.
gcc/testsuite/
* gcc.dg/pr80042.c: New testcase.
---
V2:
Add { } to all dg directives
Fix cosl too (the test was failing on this and caught it)
Add -fmath-errno just in case (apparently Darwin might have problems without it)

$ make -k check-gcc-c RUNTESTFLAGS="dg.exp=pr80042.c"
shows:
# of expected passes            2

so I think the tests are passing as expected.


 gcc/builtins.def   | 22 ++-
 gcc/testsuite/gcc.dg/pr80042.c | 71 ++
 2 files changed, 83 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr80042.c

diff --git a/gcc/builtins.def b/gcc/builtins.def
index 89fc74654ca..c7d2987a9c4 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -288,6 +288,8 @@ along with GCC; see the file COPYING3.  If not see
maintenance purposes.  */
 #undef ATTR_MATHFN_FPROUNDING_STORE
 #define ATTR_MATHFN_FPROUNDING_STORE ATTR_NOTHROW_LEAF_LIST
+#undef ATTR_MATHFN_FPROUNDING_ERRNO_STORE
+#define ATTR_MATHFN_FPROUNDING_ERRNO_STORE ATTR_NOTHROW_LEAF_LIST
 
 /* Define an attribute list for leaf functions that do not throw
exceptions normally, but may throw exceptions when using
@@ -350,14 +352,14 @@ DEF_C99_BUILTIN(BUILT_IN_COPYSIGNL, "copysignl", 
BT_FN_LONGDOUBLE_LONGDO
 #define COPYSIGN_TYPE(F) BT_FN_##F##_##F##_##F
 DEF_EXT_LIB_FLOATN_NX_BUILTINS (BUILT_IN_COPYSIGN, "copysign", COPYSIGN_TYPE, 
ATTR_CONST_NOTHROW_LEAF_LIST)
 #undef COPYSIGN_TYPE
-DEF_LIB_BUILTIN(BUILT_IN_COS, "cos", BT_FN_DOUBLE_DOUBLE, 
ATTR_MATHFN_FPROUNDING)
-DEF_C99_C90RES_BUILTIN (BUILT_IN_COSF, "cosf", BT_FN_FLOAT_FLOAT, 
ATTR_MATHFN_FPROUNDING)
+DEF_LIB_BUILTIN(BUILT_IN_COS, "cos", BT_FN_DOUBLE_DOUBLE, 
ATTR_MATHFN_FPROUNDING_ERRNO)
+DEF_C99_C90RES_BUILTIN (BUILT_IN_COSF, "cosf", BT_FN_FLOAT_FLOAT, 
ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_LIB_BUILTIN(BUILT_IN_COSH, "cosh", BT_FN_DOUBLE_DOUBLE, 
ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_C99_C90RES_BUILTIN (BUILT_IN_COSHF, "coshf", BT_FN_FLOAT_FLOAT, 
ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_C99_C90RES_BUILTIN (BUILT_IN_COSHL, "coshl", BT_FN_LONGDOUBLE_LONGDOUBLE, 
ATTR_MATHFN_FPROUNDING_ERRNO)
 #define COSH_TYPE(F) BT_FN_##F##_##F
 DEF_EXT_LIB_FLOATN_NX_BUILTINS (BUILT_IN_COSH, "cosh", COSH_TYPE, 
ATTR_MATHFN_FPROUNDING_ERRNO)
-DEF_C99_C90RES_BUILTIN (BUILT_IN_COSL, "cosl", BT_FN_LONGDOUBLE_LONGDOUBLE, 
ATTR_MATHFN_FPROUNDING)
+DEF_C99_C90RES_BUILTIN (BUILT_IN_COSL, "cosl", BT_FN_LONGDOUBLE_LONGDOUBLE, 
ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_EXT_LIB_FLOATN_NX_BUILTINS (BUILT_IN_COS, "cos", COSH_TYPE, 
ATTR_MATHFN_FPROUNDING)
 DEF_EXT_LIB_BUILTIN(BUILT_IN_DREM, "drem", BT_FN_DOUBLE_DOUBLE_DOUBLE, 
ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_EXT_LIB_BUILTIN(BUILT_IN_DREMF, "dremf", BT_FN_FLOAT_FLOAT_FLOAT, 
ATTR_MATHFN_FPROUNDING_ERRNO)
@@ -674,18 +676,18 @@ DEF_EXT_LIB_BUILTIN(BUILT_IN_SIGNBITD128, 
"signbitd128", BT_FN_INT_DFLOAT128
 DEF_EXT_LIB_BUILTIN(BUILT_IN_SIGNIFICAND, "significand", 
BT_FN_DOUBLE_DOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_EXT_LIB_BUILTIN(BUILT_IN_SIGNIFICANDF, "significandf", 
BT_FN_FLOAT_FLOAT, ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_EXT_LIB_BUILTIN(BUILT_IN_SIGNIFICANDL, "significandl", 
BT_FN_LONGDOUBLE_LONGDOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO)
-DEF_LIB_BUILTIN(BUILT_IN_SIN, "sin", BT_FN_DOUBLE_DOUBLE, 
ATTR_MATHFN_FPROUNDING)
-DEF_EXT_LIB_BUILTIN(BUILT_IN_SINCOS, "sincos", 
BT_FN_VOID_DOUBLE_DOUBLEPTR_DOUBLEPTR, ATTR_MATHFN_FPROUNDING_STORE)
-DEF_EXT_LIB_BUILTIN(BUILT_IN_SINCOSF, "sincosf", 
BT_FN_VOID_FLOAT_FLOATPTR_FLOATPTR, ATTR_MATHFN_FPROUNDING_STORE)
-DEF_EXT_LIB_BUILTIN(BUILT_IN_SINCOSL, "sincosl", 
BT_FN_VOID_LONGDOUBLE_LONGDOUBLEPTR_LONGDOUBLEPTR, ATTR_MATHFN_FPROUNDING_STORE)
-DEF_C99_C90RES_BUILTIN (BUILT_IN_SINF, "sinf", BT_FN_FLOAT_FLOAT, 
ATTR_MATHFN_FPROUNDING)
+DEF_LIB_BUILTIN(BUILT_IN_SIN, "sin", BT_FN_DOUBLE_DOUBLE, 
ATTR_MATHFN_FPROUNDING_ERRNO)
+DEF_EXT_LIB_BUILTIN(BUILT_IN_SINCOS, "sincos", 
BT_FN_VOID_DOUBLE_DOUBLEPTR_DOUBLEPTR, ATTR_MATHFN_FPROUNDING_ERRNO_STORE)
+DEF_EXT_LIB_BUILTIN(BUILT_IN_SINCOSF, "sincosf", 
BT_FN_VOID_FLOAT_FLOATPTR_FLOATPTR, ATTR_MATHFN_FPROUNDING_ERRNO_STORE)
+DEF_EXT_LIB_BUILTIN(BUILT_IN_SINCOSL, "sincosl", 
BT_FN_VOID_LONGDOUBLE_LONGDOUBLEPTR_LONGDOUBLEPTR, 
ATTR_MATHFN_FPROUNDING_ERRNO_STORE)
+DEF_C99_C90RES_BUILTIN (BUILT_IN_SINF, "sinf", BT_FN_FLOAT_FLOAT, 
ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_LIB_BUILTIN(BUILT_IN_SINH, "sinh", BT_FN_DOUBLE_DOUBLE, 
ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_C99_C90RES_BUILTIN (BUILT_IN_SINHF, "sinhf", BT_FN_FLOAT_FLOAT, 
ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_C99_C90RES_BUILTIN (BUILT_IN_SINHL, "sinhl", BT_FN_LONGDOUBLE_LONGDOUBLE, 
ATTR_MATHFN_FPROUNDING_ERRNO)
 #define SINH_TYPE(F) BT_FN_##F##_##F
 DE

Re: [PATCH] COBOL 8/15 360K cbl: parser support

2025-02-17 Thread James K. Lowden
On Mon, 17 Feb 2025 15:13:21 -0500
David Malcolm  wrote:

> > How do I do that?  I barely know the term; I have to look it up
> > every time.  I don't find "sarif" anywhere in gcc.info or
> > gccint.info.  
> 
> (caveat: SARIF is one of my particular interests and thus I'm biased
> towards it; not a blocker for first release, but needs to eventually
> work)

Features need champions, right?  

> It's been in gcc.info since GCC 13, 

Indeed, my bad.  

> $ info build/gcc/doc/gccint.info | grep -iw sarif
> 'scan-sarif-file REGEXP [{ target/xfail SELECTOR }]'
>  '-fdiagnostics-format=sarif-file'.
> 'scan-sarif-file-not REGEXP [{ target/xfail SELECTOR }]'
>  '-fdiagnostics-format=sarif-file'.

I was referencing gcc-11 while wandering in the desert, apparently.  

> In trunk, add
>   -fdiagnostics-add-output=sarif,

I'll be sure to do this during the warning-enumeration task. 

--jkl


Re: [PATCH] COBOL 8/15 360K cbl: parser support

2025-02-17 Thread James K. Lowden
On Mon, 17 Feb 2025 15:02:35 -0500
David Malcolm  wrote:

> > (Maybe zero_option_id would be a better name?)
> 
> Ah - yes, now I see what you mean.  I like that name.
> 
> Can it be "const"?

Already is!  I renamed it to "option_zero" to prevent future confusion.  

--jkl



Re: [PATCH] COBOL 6/15 156K lex: lexer

2025-02-17 Thread James K. Lowden
On Mon, 17 Feb 2025 15:28:48 -0500
David Malcolm  wrote:

> We shouldn't rely on assert to do checking of user-controllable input;
> it should always be checked.

Quite so.  I think you'll like the change.  
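
For readers following along, a sketch of the distinction, using a hypothetical lookup that is not taken from the COBOL front end. The bounds test is an ordinary runtime check, so it remains in place in builds where assert() is compiled out by NDEBUG.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical lookup driven by a user-supplied index.  The bounds test is
// an ordinary runtime check, so it stays in place when NDEBUG is defined.
bool lookup_checked (const std::vector<int> &table, long index, int &out)
{
  if (index < 0 || static_cast<std::size_t> (index) >= table.size ())
    return false;  // the caller reports a diagnostic instead of aborting
  out = table[static_cast<std::size_t> (index)];
  return true;
}
```

On failure the output argument is left untouched and the caller decides how to report the bad input.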

--jkl





Re: [PATCH v3 0/8] LoongArch: SIMD odd/even/horizontal widening arithmetic cleanup and optimization

2025-02-17 Thread Lulu Cheng



On 2025/2/14 8:21 PM, Xi Ruoyao wrote:

This series is intended to fix some test failures on
vect-reduc-chain-*.c by adding the [su]dot_prod* expand for LSX and LASX
vector modes.  But the code base of the related instructions was not
readable, so clean it up first (using the approach learnt from AArch64)
before adding the expands.

v2 => v3:

- Move the introduction of V1TI and V2TI from patch 3 to patch 2, so
   each commit is buildable.

v1 => v2:

- Only simplify vpick{ev,od}, not xvpick{ev,od} (where
   vect_par_cnst_even_or_odd_half is not suitable).
- Keep {sign,zero}_extend out of vec_select.
- Remove vect_par_cnst_{even,odd}_half for simd_hw__,
   to simplify the code and allow it to match the RTL in case the even
   half is selected for the left operand of addsub.  Swap the operands if
   needed when outputting the asm.
- Fix typos in commit subjects.
- Mention V2TI in loongarch-modes.def.

v2 bootstrapped and regtested on loongarch64-linux-gnu, no new code
change in v3.  Ok for trunk?


LGTM.

Thanks!


Xi Ruoyao (8):
   LoongArch: Try harder using vrepli instructions to materialize const
 vectors
   LoongArch: Allow moving TImode vectors
   LoongArch: Simplify {lsx_,lasx_x}v{add,sub,mul}l{ev,od} description
   LoongArch: Simplify {lsx_,lasx_x}vh{add,sub}w description
   LoongArch: Simplify {lsx_,lasx_x}vmaddw description
   LoongArch: Simplify lsx_vpick description
   LoongArch: Implement vec_widen_mult_{even,odd}_* for LSX and LASX
 modes
   LoongArch: Implement [su]dot_prod* for LSX and LASX modes

  gcc/config/loongarch/constraints.md   |2 +-
  gcc/config/loongarch/lasx.md  | 1070 +
  gcc/config/loongarch/loongarch-builtins.cc|   60 +
  gcc/config/loongarch/loongarch-modes.def  |5 +-
  gcc/config/loongarch/loongarch-protos.h   |3 +
  gcc/config/loongarch/loongarch.cc |   50 +-
  gcc/config/loongarch/loongarch.md |2 +-
  gcc/config/loongarch/lsx.md   | 1006 +---
  gcc/config/loongarch/predicates.md|   27 +
  gcc/config/loongarch/simd.md  |  390 +-
  gcc/testsuite/gcc.target/loongarch/vrepli.c   |   15 +
  .../gcc.target/loongarch/wide-mul-reduc-1.c   |   18 +
  .../gcc.target/loongarch/wide-mul-reduc-2.c   |   18 +
  13 files changed, 612 insertions(+), 2054 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/loongarch/vrepli.c
  create mode 100644 gcc/testsuite/gcc.target/loongarch/wide-mul-reduc-1.c
  create mode 100644 gcc/testsuite/gcc.target/loongarch/wide-mul-reduc-2.c





Re: [PATCH] COBOL 8/15 360K cbl: parser support

2025-02-17 Thread David Malcolm
On Mon, 2025-02-17 at 12:42 -0500, James K. Lowden wrote:
> On Sat, 15 Feb 2025 23:37:20 -0500
> David Malcolm  wrote:
> 
> > +  rich_location richloc (line_table, token_location);
> > +  bool ret = global_dc->diagnostic_impl (&richloc, nullptr,
> > option_id,
> > + gmsgid, &ap, DK_ERROR);
> > +  va_end (ap);
> > +  global_dc->end_group();
> > +}
> > 
> > For errors, just pass 0 as the diagnostic_option_id.  Same for the
> > various DK_SORRY and DK_FATAL.
> 
> OK, but is this a style thing?  That's effectively what happens,
> using a name.  
> 
> option_id is a file-scope static constant, initialized to 0.  Instead
> of passing an integer that the compiler uses to construct a temporary
> diagnostic_option_id, we pass an already-constructed
> diagnostic_option_id by value.  
> 
> (Maybe zero_option_id would be a better name?)

Ah - yes, now I see what you mean.  I like that name.

Can it be "const"?

> 
> > +bool
> > +yywarn( const char gmsgid[], ... ) {
> > +  verify_format(gmsgid);
> > +  auto_diagnostic_group d;
> > +  va_list ap;
> > +  va_start (ap, gmsgid);
> > +  auto ret = emit_diagnostic_valist( DK_WARNING, token_location,
> > + option_id, gmsgid, &ap );
> > +  va_end (ap);
> > +  return ret;
> > +}
> > 
> > For warnings, ideally this should take a diagnostic_option_id
> > controlling the warning as the initial parameter, rather than have
> > a
> > global variable for this.  
> 
> Yes, absolutely.  That's on the to do list.  I wanted to get a set of
> patches submitted for consideration, and drew the line ahead of that
> item.  
> 
> > Is this something that yacc is imposing on you?
> 
> Not at all.  I need to go into gcc/cobol/lang.opt and enumerate the
> warnings.  Then I need to pass the warning ID into yywarn (which will
> be renamed warn_msg() because the "yy" prefix is properly reserved
> for yacc).  
> 
> As we say, just a small matter of programming.  :-) 

Indeed.  It sounds like this is moving in the right direction, and FWIW
I don't have a problem with it going into trunk in its current state.
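
A rough sketch of the shape being discussed, with stand-in types: option_id here is a hypothetical placeholder for diagnostic_option_id, and the function returns what it would have emitted rather than calling into GCC's diagnostic machinery.

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical stand-in for GCC's diagnostic_option_id.
struct option_id { int value; };

// Hypothetical record of what would have been emitted.
struct emitted { int option; std::string text; };

// Sketch of a warn_msg() that takes the controlling option explicitly
// instead of reading a file-scope variable.
emitted warn_msg (option_id opt, const char *fmt, ...)
{
  char buf[256];
  va_list ap;
  va_start (ap, fmt);
  std::vsnprintf (buf, sizeof buf, fmt, ap);
  va_end (ap);
  return emitted{ opt.value, buf };
}
```

Each call site then names the warning it belongs to, which is what lets a per-warning option in lang.opt control it.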

Dave



RE: [PATCH] COBOL 8/15 360K cbl: parser support

2025-02-17 Thread Robert Dubner



> -Original Message-
> From: Richard Biener 
> Sent: Monday, February 17, 2025 08:24
> To: James K. Lowden 
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] COBOL 8/15 360K cbl: parser support
>
> On Sat, Feb 15, 2025 at 10:08 PM James K. Lowden
>  wrote:
> >
> > From 5d53920602e234e4d99ae2d502e662ee3699978e 4 Oct 2024 12:01:22 -0400
> > From: "James K. Lowden" 
> > Date: Sat 15 Feb 2025 12:50:53 PM EST
> > Subject: [PATCH] 9 new 'cobol' FE files
> >
> > gcc/cobol/ChangeLog
> > * cdf.y: New file.
> > * cobol1.cc: New file.
> > * convert.cc: New file.
> > * except.cc: New file.
> > * gcobolspec.cc: New file.
> > * structs.cc: New file.
> > * symbols.cc: New file.
> > * symfind.cc: New file.
> > * util.cc: New file.

I have trimmed away everything; I hope that's not too radical.

I have been addressing many of the comments in your four messages.  I just 
committed a bunch of changes to our repository; here is the ChangeLog entry:

2025-02-17  Robert Dubner 
* Moved #include  from genapi.cc to cobol-system.h as
#include 
* Removed GCOBOL_FOR_TARGET from /Makefile.def
* Removed if $USER = "bob" stuff from cobol/Make-lang.in
* Backed -std=c++17 down to c++14 in cobol/Make-lang.in
* Removed the single c++17 dependency from show_parse.h ANALYZER
* Removed -Wno-cpp from cobol/Make-lang.in
* Removed -Wno-missing-field-initializers from cobol/Make-lang.in
* Added some informative comments to placeholder functions in cobol1.cc
* Removed a call to build_tree_list() in cobol1.cc
* Use default for LANG_HOOKS_TYPE_FOR_SIZE in cobol1.cc
* Commented out, but saved, unused code in convert.cc
* Eliminated numerous "-Wmissing-field-initializers" warnings

Jim will very shortly be using those committed changes to create a third set 
of patches.

Within the last couple of hours, I used the current repository to 
"../configure --enable-languages=all" followed by a "make -sj `nproc`"

I am happy to report that the resulting bootstrapped multilib builds 
succeeded on both x86_64 and aarch64 architectures.

Again: Thank you so much.  I think I've said I am new to open software 
development.  The spirit of cooperation is eye opening.


Re: [PATCH] middle-end: Fixup constant integers when expanding __builtin_crc [PR118288]

2025-02-17 Thread Jeff Law




On 2/16/25 2:07 PM, Uros Bizjak wrote:

Constant integers with MSB set have to be represented as corresponding
signed integers.  Use gen_int_mode to emit them in the correct way.
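
As background, the arithmetic behind that statement can be sketched outside GCC. The function below is a made-up illustration of truncating a value to a given width and sign-extending it; it mirrors the idea only and is not the implementation of gen_int_mode.

```cpp
#include <cstdint>

// Truncate v to its low `bits` bits and sign-extend the result to 64 bits.
// `bits` is expected to be in the range 1..64.
std::int64_t sign_extend (std::uint64_t v, unsigned bits)
{
  if (bits == 0 || bits >= 64)
    return static_cast<std::int64_t> (v);
  const std::uint64_t mask = (std::uint64_t{1} << bits) - 1;
  const std::uint64_t sign = std::uint64_t{1} << (bits - 1);
  v &= mask;
  // Flipping the sign bit and subtracting it back maps values with the
  // top bit set onto the corresponding negative numbers.
  return static_cast<std::int64_t> (v ^ sign) - static_cast<std::int64_t> (sign);
}
```

For a 32-bit width, 0x80000000 becomes -2147483648 and 0xFFFFFFFF becomes -1, while values without the top bit set are unchanged.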

 PR middle-end/118288

gcc/ChangeLog:

 * builtins.cc (expand_builtin_crc_table_based):
 Use gen_int_mode to emit constant integers with MSB set.

gcc/testsuite/ChangeLog:

 * gcc.dg/pr118288.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

OK for mainline?

Definitely OK.  Thanks for chasing this down.



jeff


Re: [PATCH] COBOL 7/15 492K par: parser

2025-02-17 Thread David Malcolm
On Mon, 2025-02-17 at 13:42 -0500, James K. Lowden wrote:
> On Sat, 15 Feb 2025 23:35:16 -0500
> David Malcolm  wrote:
> 
> On better messages ...
> 
> > +  if( ($$ & $2) == $2 ) {
> > +    error_msg(@2, "%s clause repeated", clause);
> > +    YYERROR;
> > +  }
> > 
> > Obviously not needed for initial release, but it would be neat to
> > have
> > a fix-it hint here that deletes the repeated token (fixit hints are
> > done via class rich_location, FWIW)
> 
> Noted, thanks, 
> 
> > +  if( $data_clause == redefines_clause_e ) {
> > +    error_msg(@2, "REDEFINES must appear "
> > + "immediately after LEVEL and NAME");
> > +    YYERROR;
> > +  }
> > 
> > A strict reading of our diagnostic guidelines suggests that all of
> > these keywords in these messages should be in quotes, via %{ and
> > %}, or
> > via %qs.  But given that cobol has UPPERCASE KEYWORDS THAT ALREADY
> > REALLY STAND OUT, maybe that's overkill.
> 
> I endeavored to report every keyword in uppercase.  The user isn't
> required to use uppercase; COBOL is (despite the official name, heh)
> case-insensitive.  But the fact of the keyword is all we have.  The
> lexer doesn't capture how the user typed it in; it reports only the
> presence of the token.  
> 
> In the case of user-defined names, the actual name supplied is
> captured and reported literally.  So, the user could have e.g. 
> 
>   001-Initialization Section.
> 
> where "Section" is a token whose input string is discarded, and 
> "001-Initialization" is a user-defined name whose supplied form is
> preserved, and reported verbatim.  
> 
> With that in mind, I propose a policy that builds on your observation
> (for gcc-16, not today):  Report token names in uppercase, unquoted,
> and user-defined names verbatim, quoted.  

Sounds reasonable to me.

> 
> I have that filed under "tasks for an eager volunteer, but probably
> me".  More tedious than difficult.  
> 
> > > +  error_msg(@2, "%s is binary NUMERIC type, "
> > > +   "incompatible with SIGN IS", field->name);
> > 
> > > Again, this isn't needed for the initial release, but GCC's
> > > diagnostics
> > > can have "rules" associated with them, which can have URLs (see
> > > diagnostic-metadata.h).  Is there a useful public standard for Cobol
> > > with such rules that the output can link to?
> 
> There is no freely available COBOL standard.  IBM and Microfocus (and
> others) do publish their documentation on the  web, but the official
> standard is copyrighted and comes at a price.  Please write to your
> congressman.  
> 
> That said, it's been my ambition to tie every relevant message to the
> ISO standard in force at time of compilation.  To that end, I want to
> move all messages to a table keyed by ISO version and section number
> (or other, for -dialect option).  The caller would refer to the table
> by the key, and error_msg() et al. would report that information
> along with the message text.

For reference, I posted a patch to do this for the C++ frontend here:
https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669105.html and
we do something similar in the rust frontend with error codes (see
class rust_error_code_rule in gcc/rust/rust-diagnostics.cc).

But as I said, this isn't at all needed for the initial release, of
course; just trying to be fancy.
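
To make the table idea concrete, here is a minimal sketch of a message table keyed by standard edition and section number. Everything in it is an assumption for illustration: `rule_key`, `message_table` and `describe` are made-up names, and the section number used in the usage note is invented, not a real citation of the ISO standard.

```cpp
#include <map>
#include <string>
#include <tuple>

// Hypothetical key: which edition of the standard, and which section of it.
struct rule_key {
  int iso_year;         // edition of the standard, e.g. 2023
  std::string section;  // section number within that edition
  bool operator<(const rule_key& rhs) const {
    return std::tie(iso_year, section) < std::tie(rhs.iso_year, rhs.section);
  }
};

using message_table = std::map<rule_key, std::string>;

// Return the message text prefixed with its origin, or an empty string
// if the table has no entry for the key.
std::string describe(const message_table& table, const rule_key& key)
{
  auto it = table.find(key);
  if (it == table.end())
    return "";
  return "[ISO " + std::to_string(key.iso_year) + " " + key.section + "] "
         + it->second;
}
```

A caller would register a message once, for example under `rule_key{2023, "13.18.49"}` (an invented section number), and every report of that message would then carry the same reference.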

>   
> 
> I don't know of another compiler that does that.  I don't mind
> showing them how it's done!  

Indeed, it's good to lead :)

[...snip...]

Dave



Re: [PATCH] COBOL 6/15 156K lex: lexer

2025-02-17 Thread David Malcolm
On Mon, 2025-02-17 at 12:42 -0500, James K. Lowden wrote:
> On Sat, 15 Feb 2025 23:32:37 -0500
> David Malcolm  wrote:
> 
> > > +  free(copier);
> > 
> > > There's a manual free of "copier" here, but there are various
> > > error-handling early-return paths that will leak. Maybe just use a
> > > std::string?
> > > 
> > > Similarly with "path"; I think this is always leaked. Maybe
> > > std::string here too.
> > > 
> > > Have you tried running the compiler under valgrind?  Configure with
> > > --enable-valgrind-annotations and pass -wrapper valgrind to the
> > > driver.
> 
> It's no accident, comrade.  ;-) 
> 
> My design criterion: the parser's memory requirements are linear with
> the input.  As grows the COBOL text, so grows the memory
> consumption.  Any more would be wrong; any less is pointless.  
> 
> The parser makes a single pass over the input.  It can't "leak"
> except in a loop.  In general it therefore never calls free(3).  Only
> when a string is being built up of consecutive allocations in a loop
> do we take any care with free.  
> 
> Before anyone suggests that's wasteful of memory, let's remind
> ourselves of the compiler's design.  The input text is transformed
> into a GENERIC tree.  The compiled program *must* fit in memory. 
> Much of that input needs to be retained as debug strings and supplied
> values.  
> 
> We have tested gcobol on large COBOL inputs, hundreds of thousands of
> lines.  gcobol compiles those programs, as is, without freeing
> memory, on virtual machines that can barely link cc1.  
> 
> In the instant case, when open_file() fails, the compiler will soon
> terminate for lack of input.  Code generation will not be engaged. 
> The lost strings are literally no concern at all.  (I can see why
> that might not be obvious on first reading, and I hope I don't sound
> harsh or dismissive.)  
> 
> As for std::string, it would only complicate the front end.  Most
> strings in the parser are either fed back in to some C API or become
> parts of GENERIC nodes.  I admit having been tempted by
> std::stringstream for concatenation but in the end asprintf did most
> of what was needed.  
> 
> > Have you tried running the compiler under valgrind?  Configure with
> > --enable-valgrind-annotations and pass -wrapper valgrind to the
> > driver.
> 
> We have not tried that incantation, no.  We used valgrind for
> corruption problems, both times.  ;-)   We measured performance and
> memory use with the excellent Linux perf tool.  That is how we found
> the astonishing problems with std::regex (and also with how we were
> using it).  
> 
> I hope you see that we're taking care, but not too much care, with
> memory.  If there is a loop that is missing a free, we'll fix it, but
> I bet it won't matter, because these are just string fragments for
> filenames and such.  To dutifully call free for every allocated scrap
> of parsed input would be counterproductive: error-prone, and little
> if anything saved.  

Understood.

The rest of the compiler is relatively "clean" w.r.t. valgrind, so when
we do get leaks, valgrind shows them up clearly.  So a benefit of my
suggested approach is that if you *do* need to use valgrind at some
point, it doesn't get swamped by noise from the frontend.  But you can
probably filter out all allocations originating from the frontend and
get similar benefits.
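
For readers following the std::string suggestion, a minimal sketch of what it buys on early-return paths. The function `copybook_path` and the length limit are hypothetical, loosely modelled on the lookup under review, and are not the gcobol code; whether the trade-off suits the front end is the authors' call, as discussed above.

```cpp
#include <string>

// Hypothetical helper: build a lookup path from a directory and a name.
// Returns an empty string on failure.  No path through this function
// needs a matching free(); the string releases its buffer on every exit.
std::string copybook_path(const std::string& dir, const std::string& name)
{
  if (name.empty())
    return "";                    // early exit: nothing to clean up

  std::string path = dir.empty() ? name : dir + "/" + name;

  const std::string::size_type limit = 4096;  // arbitrary illustrative limit
  if (path.size() > limit)
    return "";                    // early exit: path's buffer is released here

  return path;                    // ownership passes to the caller
}
```

A C API that wants a `const char *` can still be fed `path.c_str()` for as long as the string is alive.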

Dave



[Patch] OpenMP/Fortran: extend 'adjust_args' clause, fixes for it and declare variant [PR115271]

2025-02-17 Thread Tobias Burnus

Hi all,

on the fixes side: If a function only appeared in an INTERFACE block,
the declare variant handling wasn't triggered - i.e. none of the
diagnostics handled there were emitted.

Additionally, when it was written as such in a module - and the module
got used, it wasn't active, such that the wrong function (the non-variant
one) was called.
This patch handles the INTERFACE block issue of PR115271. The problem
that declare variant is not saved in the .mod file still remains.

Additionally, when looking at the code, I found a superfluous and wrong
check for 'dispatch' - rejecting potentially valid code (example of such
included).

And it adds some feature support by implementing OpenMP 6.0's
adjust_args changes - namely, taking an integer (literal?) instead of
the dummy argument name - or a numeric range with const expressions and/or
'omp_num_args'. And 'need_device_addr' - however, it currently stops before
actually handling the latter by printing a sorry, not yet implemented.
[It needs some larger tweaks to handle optional + array descriptors properly.
For the C/C++ side, see also PR c++/118859.]

Finally, 'type(C_ptr) :: array(:)' is now rejected with need_device_ptr
as that's not a simple pointer but uses an array descriptor. That is
supposed to work with need_device_addr, though. On the OpenMP side,
that was fixed after 6.0 via OpenMP spec Issue #4443.

Comments, remarks, suggestions before I commit it?
(Built & regtested on x86_64-gnu-linux without offloading.)

Tobias

PS: Follow-up work to be done in that area:
- Writing 'omp declare variant' to .mod files → PR115271
  [wrong-code issue]
- Handling need_device_addr + fixing wrong-code part of
  PR c++/118859 (related; fixing the diagnostic should also be done)
  [wrong-code + feature issue; for C++ also diagnostic/accepts-invalid]
- Adding some more valid testcases (as part of ↑?)
  [also useful]
OpenMP/Fortran: extend 'adjust_args' clause, fixes for it and declare variant [PR115271]

On the extension side, it implements OpenMP 6.0's numeric values/ranges for
the adjust_args arguments, including 'omp_num_args'. And it adds parser
support for need_device_addr. It also implements the post-OpenMP-6.0
clarification of OpenMP spec Issue #4443 regarding type(c_ptr) with
dimension being invalid for need_device_ptr.

To be done: Adding full support for need_device_addr (optional, array
descriptor, ...).

On the invalid side, it removed a bogus c_ptr check that went through
all adjust_args without checking for need_device_ptr and the current scope.

And it finally also processes 'declare variant' in an INTERFACE block,
which is part of PR115271, but it does not handle .mod file yet - the
main issue tracked in that PR.

	PR fortran/115271

gcc/fortran/ChangeLog:

	* gfortran.h (gfc_omp_namelist): Change need_device_ptr to adj_args
	union and add more flags.
	* openmp.cc (gfc_match_omp_declare_variant,
	gfc_resolve_omp_declare): For adjust_args, handle need_device_addr
	and numeric values/ranges besides dummy argument names.
	(resolve_omp_dispatch): Remove a bogus adjust_args check.
	* trans-decl.cc (gfc_handle_declare_variant): New.
	(gfc_generate_module_vars, gfc_generate_function_code): Call it.
	* trans-openmp.cc (gfc_trans_omp_declare_variant): Handle numeric
	values/ranges besides dummy argument names.

gcc/testsuite/ChangeLog:

	* gfortran.dg/gomp/adjust-args-1.f90: Update dg-.* expectations.
	* gfortran.dg/gomp/adjust-args-2.f90: Likewise.
	* gfortran.dg/gomp/adjust-args-2a.f90: Likewise.
	* gfortran.dg/gomp/adjust-args-3.f90: Likewise.
	* gfortran.dg/gomp/adjust-args-4.f90: Remove array from c_ptr.
	* gfortran.dg/gomp/adjust-args-5.f90: Likewise.
	* gfortran.dg/gomp/adjust-args-11.f90: Likewise. Add check that
	INTERFACE is now handled in subroutines and in modules.
	* gfortran.dg/gomp/adjust-args-13.f90: New test.
	* gfortran.dg/gomp/adjust-args-14.f90: New test.
	* gfortran.dg/gomp/adjust-args-15.f90: New test.
	* gfortran.dg/gomp/declare-variant-21.f90: New test.

 gcc/fortran/gfortran.h |  10 +-
 gcc/fortran/openmp.cc  | 247 +
 gcc/fortran/trans-decl.cc  |  23 ++
 gcc/fortran/trans-openmp.cc| 212 ++
 gcc/testsuite/gfortran.dg/gomp/adjust-args-1.f90   |   8 +-
 gcc/testsuite/gfortran.dg/gomp/adjust-args-11.f90  |  77 ++-
 gcc/testsuite/gfortran.dg/gomp/adjust-args-13.f90  |  18 ++
 gcc/testsuite/gfortran.dg/gomp/adjust-args-14.f90  |  85 +++
 gcc/testsuite/gfortran.dg/gomp/adjust-args-15.f90  |  35 +++
 gcc/testsuite/gfortran.dg/gomp/adjust-args-2.f90   |   3 +-
 gcc/testsuite/gfortran.dg/gomp/adjust-args-2a.f90  |   8 +-
 gcc/testsuite/gfortran.dg/gomp/adjust-args-3.f90   |   4 +-
 gcc/testsuite/gfortran.dg/gomp/adjust-args-4.f90   |   8 +-
 gcc/testsuite/gfortran.dg/gomp/adjust-args-5.f90   |   8 +-
 .../gfortran.dg/gomp/declare-variant-21.f90|  20 ++
 15 files changed, 660 insertions(+), 106 deletions(-)

diff 

Re: [PATCH] COBOL 8/15 360K cbl: parser support

2025-02-17 Thread David Malcolm
On Mon, 2025-02-17 at 12:49 -0500, James K. Lowden wrote:
> On Sat, 15 Feb 2025 23:37:20 -0500
> David Malcolm  wrote:
> 
> > +const char *
> > +cobol_get_sarif_source_language(const char *)
> > +    {
> > +    return "cobol";
> > +    }
> > 
> > Out of curiosity, did you try the SARIF output?  This is a good
> > test
> > for whether you're properly using the GCC diagnostics subsystem.
> 
> How do I do that?  I barely know the term; I have to look it up every
> time.  I don't find "sarif" anywhere in gcc.info or gccint.info.  

(caveat: SARIF is one of my particular interests and thus I'm biased
towards it; not a blocker for first release, but needs to eventually
work)

It's been in gcc.info since GCC 13, I believe (are you looking at the
generated gcc.info, or at one installed on the system from an earlier
release of gcc?).

In trunk, add
  -fdiagnostics-add-output=sarif,
to the command-line options.  With that, in addition to regular text
diagnostic output, you should get a .sarif file written out containing
the diagnostics (and other metadata) in machine-readable json form
(unless you've got code doing fprintf to stderr, that is, in which case
the resulting mixture might not be well-formed JSON, of course).

Alternatively:
  -fdiagnostics-format=sarif-stderr
will replace the regular textual output with the machine-readable
output.

Ideally all the diagnostics should show up in the machine-readable
form.

For reference, https://gcc.gnu.org/wiki/SARIF has more info on
GCC's sarif support.

Hope this is helpful
Dave



Re: [PATCH] FreeBSD: Stop linking _p libs for -pg as of FreeBSD 14

2025-02-17 Thread Gerald Pfeifer
I now also pushed this back to the gcc-14 and gcc-13 release branches.

(The gcc-12 branch is presumably to end in a couple of months, so I have 
not pushed it there yet, but can do so if there is desire.)


Sorry this fell through the cracks originally - and thank you for the
contribution, Andreas!

Gerald


On Sun, 9 Jun 2024, Gerald Pfeifer wrote:
> On Fri, 13 Aug 2021, Andreas Tobler via Gcc-patches wrote:
>> I would like to commit the attached patch to trunk and after a settling 
>> period also to all open branches.
>> Is this ok?
> Our MAINTAINERS file has the following entry:
> 
>   freebsd   Andreas Tobler   
> 
> So ... yes. :-)
>  
> Seeing this did not make it into our tree, I applied the patchset,
> bootstrapped on x86_64-unknown-freebsd13.2 and pushed with a minor 
> simplification to the ChangeLog. Patch as pushed below...
> 
> Gerald
> 
> 
> commit 48abb540701447b0cd9df7542720ab65a34fc1b1
> Author: Andreas Tobler 
> Date:   Sun Jun 9 23:18:04 2024 +0200
> 
> FreeBSD: Stop linking _p libs for -pg as of FreeBSD 14
> 
> As of FreeBSD version 14, FreeBSD no longer provides profiled system
> libraries like libc_p and libpthread_p. Stop linking against them if
> the FreeBSD major version is 14 or more.
> 
> gcc:
> * config/freebsd-spec.h: Change fbsd-lib-spec for FreeBSD > 13,
> do not link against profiled system libraries if -pg is invoked.
> Add a define to note about this change.
> * config/aarch64/aarch64-freebsd.h: Use the note to inform if
> -pg is invoked on FreeBSD > 13.
> * config/arm/freebsd.h: Likewise.
> * config/i386/freebsd.h: Likewise.
> * config/i386/freebsd64.h: Likewise.
> * config/riscv/freebsd.h: Likewise.
> * config/rs6000/freebsd64.h: Likewise.
> * config/rs6000/sysv4.h: Likewise.
> 
> diff --git a/gcc/config/aarch64/aarch64-freebsd.h 
> b/gcc/config/aarch64/aarch64-freebsd.h
> index 53cc17a1caf..e26d69ce46c 100644
> --- a/gcc/config/aarch64/aarch64-freebsd.h
> +++ b/gcc/config/aarch64/aarch64-freebsd.h
> @@ -35,6 +35,7 @@
>  #undef  FBSD_TARGET_LINK_SPEC
>  #define FBSD_TARGET_LINK_SPEC " \
>  %{p:%nconsider using `-pg' instead of `-p' with gprof (1)}  \
> +" FBSD_LINK_PG_NOTE "\
>  %{v:-V} \
>  %{assert*} %{R*} %{rpath*} %{defsym*}   \
>  %{shared:-Bshareable %{h*} %{soname*}}  \
> diff --git a/gcc/config/arm/freebsd.h b/gcc/config/arm/freebsd.h
> index 9d0a5a842ab..ee4860ae637 100644
> --- a/gcc/config/arm/freebsd.h
> +++ b/gcc/config/arm/freebsd.h
> @@ -47,6 +47,7 @@
>  #undef   LINK_SPEC
>  #define LINK_SPEC "  \
>%{p:%nconsider using `-pg' instead of `-p' with gprof (1)} \
> +  " FBSD_LINK_PG_NOTE "  
> \
>%{v:-V}\
>%{assert*} %{R*} %{rpath*} %{defsym*}  
> \
>%{shared:-Bshareable %{h*} %{soname*}} \
> diff --git a/gcc/config/freebsd-spec.h b/gcc/config/freebsd-spec.h
> index a6d1ad1280f..f43056bf2cf 100644
> --- a/gcc/config/freebsd-spec.h
> +++ b/gcc/config/freebsd-spec.h
> @@ -92,19 +92,29 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
> If not, see
> libc, depending on whether we're doing profiling or need threads support.
> (similar to the default, except no -lg, and no -p).  */
>  
> +#if FBSD_MAJOR < 14
> +#define FBSD_LINK_PG_NOTHREADS   "%{!pg: -lc}  %{pg: -lc_p}"
> +#define FBSD_LINK_PG_THREADS "%{!pg: %{pthread:-lpthread} -lc} " \
> + "%{pg: %{pthread:-lpthread} -lc_p}"
> +#define FBSD_LINK_PG_NOTE ""
> +#else
> +#define FBSD_LINK_PG_NOTHREADS "%{-lc} "
> +#define FBSD_LINK_PG_THREADS   "%{pthread:-lpthread} -lc "
> +#define FBSD_LINK_PG_NOTE "%{pg:%nFreeBSD no longer provides profiled "\
> +   "system libraries}"
> +#endif
> +
>  #ifdef FBSD_NO_THREADS
>  #define FBSD_LIB_SPEC "  
> \
>%{pthread: %eThe -pthread option is only supported on FreeBSD when gcc \
>  is built with the --enable-threads configure-time option.}   \
>%{!shared: \
> -%{!pg: -lc}  
> \
> -%{pg:  -lc_p}\
> +" FBSD_LINK_PG_NOTHREADS "   
> \
>}"
>  #else
>  #define FBSD_LIB_SPEC "  
> \
>%{!shared: \
> -%{!pg: %{pthread:-lpthre

Re: [PATCH] arm: Increment LABEL_NUSES when using minipool_vector_label

2025-02-17 Thread Richard Earnshaw (lists)
On 13/02/2025 21:43, H.J. Lu wrote:
> Increment LABEL_NUSES when using minipool_vector_label to avoid the zero
> use count on minipool_vector_label.
> 
> PR target/118866
> * config/arm/arm.cc (arm_reorg): Increment LABEL_NUSES when
> using minipool_vector_label.
> 

Whilst this patch isn't wrong per se, I'm concerned that it's likely due to 
something else violating the assumptions that a TARGET_MACHINE_DEPENDENT_REORG 
pass implementation is entitled to make.  On arm, the insertion of minipools in 
the code has to assume that the BB layout won't change after that point 
(otherwise the offset calculations will be wrong).  In fact, only changes that 
reduce code size within a single basic block are going to be safe at this point.

So what's changed to make this patch needed, and is it being run too late?

R.


[wwwdocs] gcc-15/changes: Mention the new -mveclibabi=aocl option in the IA-32/x86-64 section

2025-02-17 Thread Filip Kastl
Hi,

I'm mentioning a change I made in the gcc-15/changes.html file.  Validated with
the W3 Validator.  Is this ok to be pushed?

Cheers,
Filip Kastl


-- 8< --


---
 htdocs/gcc-15/changes.html | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/htdocs/gcc-15/changes.html b/htdocs/gcc-15/changes.html
index 853fad03..e8ff31da 100644
--- a/htdocs/gcc-15/changes.html
+++ b/htdocs/gcc-15/changes.html
@@ -584,6 +584,12 @@ asm (".text; %cc0: mov %cc2, %%r0; .previous;"
   -mavx512pf, -mprefetchwt1,
   -mtune=knl, and -mtune=knm compiler switches.
   
+  GCC now supports generating vectorized math calls to the math library
+from AMD Optimizing CPU Libraries (AOCL LibM). This option is available
+through the -mveclibabi=aocl compiler switch. GCC continues to
+support generating calls to AMD Core Math Library (ACML). However, that
+library is end-of-life and AOCL offers many more vectorized functions.
+  
 
 
 
-- 
2.47.1



[PING][PATCH v2] libcpp: Fix incorrect line numbers in large files [PR108900]

2025-02-17 Thread Yash . Shinde
From: Yash Shinde 

This patch addresses an issue in the C preprocessor where incorrect line number 
information is generated when processing
files with a large number of lines. The problem arises from improper handling 
of location intervals in the line map,
particularly when locations exceed LINE_MAP_MAX_LOCATION_WITH_PACKED_RANGES.

By ensuring that the highest location is not decremented if it would move to a 
different ordinary map, this fix resolves
the line number discrepancies observed in certain test cases. This change 
improves the accuracy of line number reporting,
benefiting users relying on precise code coverage and debugging information.

Signed-off-by: Jeremy Bettis 
Signed-off-by: Yash Shinde 
---
 libcpp/files.cc | 8 
 1 file changed, 8 insertions(+)

diff --git a/libcpp/files.cc b/libcpp/files.cc
index 1ed19ca..3e6ca119ad5 100644
--- a/libcpp/files.cc
+++ b/libcpp/files.cc
@@ -1046,6 +1046,14 @@ _cpp_stack_file (cpp_reader *pfile, _cpp_file *file, 
include_type type,
&& type < IT_DIRECTIVE_HWM
&& (pfile->line_table->highest_location
!= LINE_MAP_MAX_LOCATION - 1));
+
+  if (decrement && LINEMAPS_ORDINARY_USED (pfile->line_table))
+{
+  const line_map_ordinary *map = LINEMAPS_LAST_ORDINARY_MAP 
(pfile->line_table);
+  if (map && map->start_location == pfile->line_table->highest_location)
+   decrement = false;
+}
+
   if (decrement)
 pfile->line_table->highest_location--;
 
-- 
2.43.0
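
The guard added by the patch above can be pictured with a deliberately simplified model. This is a sketch under stated assumptions: `line_table_model` and `step_back` are hypothetical names, and the structure does not match libcpp's real line_maps layout. It only mirrors the condition in the diff: do not decrement when the highest location is the start of the last ordinary map.

```cpp
#include <vector>

// Simplified stand-in for libcpp's line table; field names are
// illustrative only.
struct line_table_model {
  unsigned highest_location;
  std::vector<unsigned> map_starts;  // start location of each ordinary map
};

// Return the highest location after the optional decrement.
unsigned step_back(const line_table_model& t, bool decrement)
{
  // If highest_location is the first location of the last map, stepping
  // back would land in the previous map, so leave it alone.
  if (decrement && !t.map_starts.empty()
      && t.map_starts.back() == t.highest_location)
    decrement = false;

  return decrement ? t.highest_location - 1 : t.highest_location;
}
```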



Re: [r15-7532 Regression] FAIL: g++.dg/asan/pr118763.C -Os execution test on Linux/x86_64

2025-02-17 Thread Sam James
"haochen.jiang"  writes:

> On Linux/x86_64,
>
> e96e1bb69c7b46db18e747ee379a62681bc8c82d is the first bad commit
> commit e96e1bb69c7b46db18e747ee379a62681bc8c82d
> Author: Jason Merrill 
> Date:   Fri Feb 14 10:53:01 2025 +0100
>
> c++: extended temp cleanups [PR118856]
>
> caused
>
> FAIL: g++.dg/asan/pr118763.C   -O0  execution test
> FAIL: g++.dg/asan/pr118763.C   -O1  execution test
> FAIL: g++.dg/asan/pr118763.C   -Os  execution test
>
> with GCC configured with
>
> ../../gcc/configure
> --prefix=/export/users/haochenj/src/gcc-bisect/master/master/r15-7532/usr
> --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld
> --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet
> --without-isl --enable-libmpx x86_64-linux --disable-bootstrap

I see it too. I've filed PR118905. Thanks.


Ping: [PATCH] late-combine: Tighten register class check [PR108840]

2025-02-17 Thread Richard Sandiford
Ping

Richard Sandiford  writes:
> gcc.target/aarch64/pr108840.c has failed since r15-268-g9dbff9c05520
> (which means that I really ought to have looked at it earlier).
>
> The test wants us to fold an SImode AND into all shifts that use it.
> This is something that late-combine is supposed to do, but:
>
> (1) the pre-RA pass chickened out because of a register pressure check
>
> (2) the post-RA pass can't handle it, because the shift uses are in
> QImode and the sets are in SImode
>
> Both are things that would be good to fix.  But (1) is particularly
> silly.  The constraints on the shift have "rk" for the destination
> (so allowing the stack pointer) and "r" for the first source.
> Including the stack pointer made the destination seem more permissive
> than the source.
>
> The intention was instead to check whether there are any
> *allocatable* registers in the destination class that aren't
> present in the source.
>
> That's enough for all tests but the last one.  The last one still
> fails because combine merges the final shift with the move into
> the hard return register, giving an arithmetic instruction with
> a hard register destination.  Pre-RA late-combine currently punts
> on those, again due to register pressure concerns.  That too is
> something I'd like to relax, but not for GCC 15.  In the interim,
> the best thing seems to be to disable combine for the test.
>
> Bootstrapped & regression-tested on aarch64-linux-gnu and
> x86_64-linux-gnu.  OK to install?
>
> Richard
>
>
> gcc/
>   PR rtl-optimization/108840
>   * late-combine.cc (late_combine::check_register_pressure):
>   Take only allocatable registers into account when checking
>   the permissiveness of register classes.
>
> gcc/testsuite/
>   PR rtl-optimization/108840
>   * gcc.target/aarch64/pr108840.c: Run at -O2 but disable combine.
> ---
>  gcc/late-combine.cc | 10 --
>  gcc/testsuite/gcc.target/aarch64/pr108840.c |  2 +-
>  2 files changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/late-combine.cc b/gcc/late-combine.cc
> index 1707ceebd5f..90d7ef09583 100644
> --- a/gcc/late-combine.cc
> +++ b/gcc/late-combine.cc
> @@ -552,8 +552,14 @@ late_combine::check_register_pressure (insn_info *insn, 
> rtx set)
> // Make sure that the source operand's class is at least as
> // permissive as the destination operand's class.
> auto src_class = alternative_class (alt, i);
> -   if (!reg_class_subset_p (dest_class, src_class))
> - return false;
> +   if (dest_class != src_class)
> + {
> +   auto extra_dest_regs = (reg_class_contents[dest_class]
> +   & ~reg_class_contents[src_class]
> +   & ~fixed_reg_set);
> +   if (!hard_reg_set_empty_p (extra_dest_regs))
> + return false;
> + }
>  
> // Make sure that the source operand occupies no more hard
> // registers than the destination operand.  This mostly matters
> diff --git a/gcc/testsuite/gcc.target/aarch64/pr108840.c 
> b/gcc/testsuite/gcc.target/aarch64/pr108840.c
> index 804c1cd9156..7e1ea6fa4fe 100644
> --- a/gcc/testsuite/gcc.target/aarch64/pr108840.c
> +++ b/gcc/testsuite/gcc.target/aarch64/pr108840.c
> @@ -1,6 +1,6 @@
>  /* PR target/108840.  Check that the explicit &31 is eliminated.  */
>  /* { dg-do compile } */
> -/* { dg-options "-O" } */
> +/* { dg-options "-O2 -fno-tree-vectorize -fdisable-rtl-combine" } */
>  
>  int
>  foo (int x, int y)
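
The tightened check in the quoted patch boils down to one set expression: registers in the destination class, minus those in the source class, minus the fixed (non-allocatable) ones. A minimal sketch with a toy 8-register file follows; `reg_set` and `dest_more_permissive` are illustrative names, not GCC's HARD_REG_SET API.

```cpp
#include <bitset>

using reg_set = std::bitset<8>;  // toy register file with 8 hard registers

// True if the destination class offers an allocatable register that the
// source class lacks, which is the case where the pass should back off.
bool dest_more_permissive(reg_set dest, reg_set src, reg_set fixed)
{
  reg_set extra = dest & ~src & ~fixed;
  return extra.any();
}
```

In the "rk" versus "r" case from the description, the only extra destination register is the stack pointer. Because it is fixed, it drops out of `extra` and the check no longer rejects the combination.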


Re: [PING][PATCH v2] libcpp: Fix incorrect line numbers in large files [PR108900]

2025-02-17 Thread Alexander Monakov
Hi,

you may want to Cc Lewis Hyatt on such patches (adding him now), I didn't touch
this area.

Alexander

On Mon, 17 Feb 2025, yash.shi...@windriver.com wrote:

> From: Yash Shinde 
> 
> This patch addresses an issue in the C preprocessor where incorrect line 
> number information is generated when processing
> files with a large number of lines. The problem arises from improper handling 
> of location intervals in the line map,
> particularly when locations exceed LINE_MAP_MAX_LOCATION_WITH_PACKED_RANGES.
> 
> By ensuring that the highest location is not decremented if it would move to 
> a different ordinary map, this fix resolves
> the line number discrepancies observed in certain test cases. This change 
> improves the accuracy of line number reporting,
> benefiting users relying on precise code coverage and debugging information.
> 
> Signed-off-by: Jeremy Bettis 
> Signed-off-by: Yash Shinde 
> ---
>  libcpp/files.cc | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/libcpp/files.cc b/libcpp/files.cc
> index 1ed19ca..3e6ca119ad5 100644
> --- a/libcpp/files.cc
> +++ b/libcpp/files.cc
> @@ -1046,6 +1046,14 @@ _cpp_stack_file (cpp_reader *pfile, _cpp_file *file, 
> include_type type,
>   && type < IT_DIRECTIVE_HWM
>   && (pfile->line_table->highest_location
>   != LINE_MAP_MAX_LOCATION - 1));
> +
> +  if (decrement && LINEMAPS_ORDINARY_USED (pfile->line_table))
> +{
> +  const line_map_ordinary *map = LINEMAPS_LAST_ORDINARY_MAP 
> (pfile->line_table);
> +  if (map && map->start_location == pfile->line_table->highest_location)
> + decrement = false;
> +}
> +
>if (decrement)
>  pfile->line_table->highest_location--;
>  
> 


Re: [PATCH v1] RISC-V: Make VXRM as global register [PR118103]

2025-02-17 Thread Richard Sandiford
"Li, Pan2"  writes:
> Thanks Jeff and Richard S.
>
> Not sure if I followed the discussion correctly, but this patch only tries
> to fix the vxrm insn being deleted during late-combine (same scenario as
> frm) by adding it to global_regs.
>
> If global_regs is not the right place according to the semantics of vxrm,
> we may need some other fix.
> AFAIK, the main difference between vxrm and frm looks like the below; take
> the rvv intrinsics as an example:
>
> void vxrm ()
> {
>   size_t vl = __riscv_vsetvl_e16m1 (N);
>   vuint16m1_t va = __riscv_vle16_v_u16m1 (a, vl);
>   vuint16m1_t vb = __riscv_vle16_v_u16m1 (b, vl);
>   vuint16m1_t vc = __riscv_vaaddu_vv_u16m1 (va, vb, __RISCV_VXRM_RDN, vl);
>
>   __riscv_vse16_v_u16m1 (c, vc, vl);
>
>   call_external ();
> }
>
> void frm ()
> {
>   size_t vl = __riscv_vsetvl_e16m1 (N);
>
>   vfloat16m1_t va = __riscv_vle16_v_f16m1(af, vl);
>   va = __riscv_vfnmadd_vv_f16m1_rm(va, va, va, __RISCV_FRM_RDN, vl);
>   __riscv_vse16_v_f16m1(bf, va, vl);
>
>   call_external ();
> }
>
> With option "-march=rv64gcv_zvfh -O3"
>
> vxrm:
>         csrwi   vxrm,2          // Just set rm directly
> ...
>         vle16.v v2,0(a4)
>         vle16.v v1,0(a3)
> ...
>         vaaddu.vv       v1,v1,v2
>         vse16.v v1,0(a4)
>         tail    call_external
>
> frm:
>         frrm    a2              // backup
>         fsrmi   2               // set rm
> ...
>         vle16.v v1,0(a3)
>         addi    a5,a5,%lo(bf)
>         vfnmadd.vv      v1,v1,v1
>         vse16.v v1,0(a5)
>         fsrm    a2              // restore
>         tail    call_external
>
> However, I would like to wait Jeff, or other RISC-V ports for a while before 
> any potential action to take.

The difference in the patch seems to be:

@@ -49,6 +49,7 @@
        .type   main, @function
 main:
 .LFB2:
+       csrwi   vxrm,2
        addi    sp,sp,-16
 .LCFI0:
        sd      ra,8(sp)

giving:

main:
.LFB2:
        csrwi   vxrm,2
        addi    sp,sp,-16
.LCFI0:
        sd      ra,8(sp)
.LCFI1:
        call    initialize
        lui     a3,%hi(a)
        lui     a4,%hi(b)
        vsetivli        zero,4,e16,m1,ta,ma
        addi    a4,a4,%lo(b)
        addi    a3,a3,%lo(a)
        vle16.v v2,0(a4)
        vle16.v v1,0(a3)
        lui     a4,%hi(c)
        addi    a4,a4,%lo(c)
        li      a0,0
        vaaddu.vv       v1,v1,v2
        vse16.v v1,0(a4)
        ld      ra,8(sp)
.LCFI2:
        addi    sp,sp,16
.LCFI3:
        jr      ra

But if VXRM is call-clobbered, shouldn't the csrwi be after the call
to initialize, rather than before it?

The problem seems to be that mode-switching overloads VXRM_MODE_NONE
to mean both "no requirement" and "unknown state".  So we have:

static int
singleton_vxrm_need (void)
{
  /* Only needed for vector code.  */
  if (!TARGET_VECTOR)
    return VXRM_MODE_NONE;

and:

  if (vxrm_unknown_p (insn))
    return VXRM_MODE_NONE;

This means that VXRM is assumed to be transparent in an instruction
that matches vxrm_unknown_p.  The pass then thinks that it can move
the initialisation of VXRM up through the call to initialize to the
head of the block, even though the call clobbers VXRM and the uses
are after the call.

For mode-switching to work properly when the mode is not always known,
there need to be different "neutral" and "unknown" states.  E.g. if the
current state is X:

   after (X, neutral) == X

but:

   after (X, unknown) == unknown

So it looks like the global_regs change is masking an incorrect placement
of the VXRM instructions.  If the call had been to some external function
that clobbers VXRM then (AIUI) the code after the patch would still be wrong.

I think there needs to be something like a VXRM_MODE_UNKNOWN.
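
The two rules above, after (X, neutral) == X and after (X, unknown) == unknown, can be written out as a tiny transfer function. This is a hypothetical model for illustration, not the mode-switching pass's actual data structures; `vxrm_state` and `after` are made-up names, and only the four rounding-mode names come from the RISC-V vector extension.

```cpp
// Distinct "neutral" (no requirement, state passes through) and
// "unknown" (state was clobbered) entries, plus the four VXRM modes.
enum class vxrm_state { neutral, rnu, rne, rdn, rod, unknown };

// State after an instruction whose effect on VXRM is `effect`,
// given that the state before it was `before`.
vxrm_state after(vxrm_state before, vxrm_state effect)
{
  if (effect == vxrm_state::neutral)
    return before;   // transparent: the incoming state survives
  return effect;     // a set yields that mode; a clobber yields unknown
}
```

With a call modelled as `unknown`, a VXRM set placed before the call no longer reaches the uses after it, so hoisting the set above the call would be visibly wrong in this model.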

Thanks,
Richard


Re: [PATCH v3] x86: Properly find the maximum stack slot alignment

2025-02-17 Thread Uros Bizjak
On Fri, Feb 14, 2025 at 2:11 PM Uros Bizjak  wrote:
>
> On Fri, Feb 14, 2025 at 4:56 AM H.J. Lu  wrote:
> >
> > On Thu, Feb 13, 2025 at 5:17 PM Uros Bizjak  wrote:
> > >
> > > On Thu, Feb 13, 2025 at 9:31 AM H.J. Lu  wrote:
> > > >
> > > > Don't assume that stack slots can only be accessed by stack or frame
> > > > registers.  We first find all registers defined by stack or frame
> > > > registers.  Then check memory accesses by such registers, including
> > > > stack and frame registers.
> > > >
> > > > gcc/
> > > >
> > > > PR target/109780
> > > > PR target/109093
> > > > * config/i386/i386.cc (ix86_update_stack_alignment): New.
> > > > (ix86_find_all_reg_use_1): Likewise.
> > > > (ix86_find_all_reg_use): Likewise.
> > > > (ix86_find_max_used_stack_alignment): Also check memory accesses
> > > > from registers defined by stack or frame registers.
> > > >
> > > > gcc/testsuite/
> > > >
> > > > PR target/109780
> > > > PR target/109093
> > > > * g++.target/i386/pr109780-1.C: New test.
> > > > * gcc.target/i386/pr109093-1.c: Likewise.
> > > > * gcc.target/i386/pr109780-1.c: Likewise.
> > > > * gcc.target/i386/pr109780-2.c: Likewise.
> > > > * gcc.target/i386/pr109780-3.c: Likewise.
> > >
> > > Some non-algorithmical changes below, otherwise LGTM. Please also get
> > > someone to review dataflow infrastructure usage, I am not well versed
> > > with it.
> > >
> > > +/* Helper function for ix86_find_all_reg_use.  */
> > > +
> > > +static void
> > > +ix86_find_all_reg_use_1 (rtx set, HARD_REG_SET &stack_slot_access,
> > > + auto_bitmap &worklist)
> > > +{
> > > +  rtx src = SET_SRC (set);
> > > +  if (MEM_P (src))
> > >
> > > Also reject assignment from CONST_SCALAR_INT?
> >
> > Done.
> >
> > > +return;
> > > +
> > > +  rtx dest = SET_DEST (set);
> > > +  if (!REG_P (dest))
> > > +return;
> > >
> > > Can we switch these two so the test for REG_P (dest) will be first? We
> > > are not interested in anything that doesn't assign to a register.
> >
> > Done.
> >
> > > +/* Find all registers defined with REG.  */
> > > +
> > > +static void
> > > +ix86_find_all_reg_use (HARD_REG_SET &stack_slot_access,
> > > +   unsigned int reg, auto_bitmap &worklist)
> > > +{
> > > +  for (df_ref ref = DF_REG_USE_CHAIN (reg);
> > > +   ref != NULL;
> > > +   ref = DF_REF_NEXT_REG (ref))
> > > +{
> > > +  if (DF_REF_IS_ARTIFICIAL (ref))
> > > +continue;
> > > +
> > > +  rtx_insn *insn = DF_REF_INSN (ref);
> > > +  if (!NONDEBUG_INSN_P (insn))
> > > +continue;
> > >
> > > Here we pass only NONJUMP_INSN_P (X) || JUMP_P (X) || CALL_P (X)
> > >
> > > +  if (CALL_P (insn) || JUMP_P (insn))
> > > +continue;
> > >
> > > And here remains only NONJUMP_INSN_P (X), so both above conditions
> > > could be substituted with:
> > >
> > > if (!NONJUMP_INSN_P (X))
> > >   continue;
> >
> > Done.
> >
> > > +
> > > +  rtx set = single_set (insn);
> > > +  if (set)
> > > +ix86_find_all_reg_use_1 (set, stack_slot_access, worklist);
> > > +
> > > +  rtx pat = PATTERN (insn);
> > > +  if (GET_CODE (pat) != PARALLEL)
> > > +continue;
> > > +
> > > +  for (int i = 0; i < XVECLEN (pat, 0); i++)
> > > +{
> > > +  rtx exp = XVECEXP (pat, 0, i);
> > > +  switch (GET_CODE (exp))
> > > +{
> > > +case ASM_OPERANDS:
> > > +case CLOBBER:
> > > +case PREFETCH:
> > > +case USE:
> > > +  break;
> > > +case UNSPEC:
> > > +case UNSPEC_VOLATILE:
> > > +  for (int j = XVECLEN (exp, 0) - 1; j >= 0; j--)
> > > +{
> > > +  rtx x = XVECEXP (exp, 0, j);
> > > +  if (GET_CODE (x) == SET)
> > > +ix86_find_all_reg_use_1 (x, stack_slot_access,
> > > + worklist);
> > > +}
> > > +  break;
> > > +case SET:
> > > +  ix86_find_all_reg_use_1 (exp, stack_slot_access,
> > > +   worklist);
> > > +  break;
> > > +default:
> > > +  debug_rtx (exp);
> > >
> > > Stray debug remaining?

Some more looking in the above loop... A comment in rtlanal.cc says that:

--q--
  /* All the other possibilities never store and can use a normal
     rtx walk.  This includes:

     - USE
     - TRAP_IF
     - PREFETCH
     - UNSPEC
     - UNSPEC_VOLATILE.  */
--/q--

So, based on the above, it is possible to considerably simplify the
loop above, all the way to:

--cut here--
  for (int i = 0; i < XVECLEN (pat, 0); i++)
    {
      rtx exp = XVECEXP (pat, 0, i);

      if (GET_CODE (exp) == SET)
        ix86_find_all_reg_use_1 (exp, stack_slot_access, worklist);
    }
--cut here--

We are only interested in SETs that set a register from a stack/frame
pointer, not in their general uses.

Attached patch implements this simplification.

Uros.
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index fafd4a511a3..560e6525b56 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/

[PATCH v1] Vect: Fix ICE when get DImode from get_related_vectype_for_scalar_type [PR116351]

2025-02-17 Thread pan2 . li
From: Pan Li 

This patch fixes an ICE like the one below.  Assume we have the
sample code:

  int a, b, c;
  short d, e, f;
  long g (long h) { return h; }

  void i () {
    for (; b; ++b) {
      f = 5 >> a ? d : d << a;
      e &= c | g(f);
    }
  }

It will ICE when compiled with -O3 -march=rv64gc_zve64f -mrvv-vector-bits=zvl:

during GIMPLE pass: vect
pr116351-1.c: In function ‘i’:
pr116351-1.c:8:6: internal compiler error: in get_len_load_store_mode, at optabs-tree.cc:655
8 | void i () {
  |  ^
0x44d6b9d internal_error(char const*, ...)
        /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/diagnostic-global-context.cc:517
0x44a26a6 fancy_abort(char const*, int, char const*)
        /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/diagnostic.cc:1722
0x19e4309 get_len_load_store_mode(machine_mode, bool, internal_fn*, vec*)
        /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/optabs-tree.cc:655
0x1fada40 vect_verify_loop_lens
        /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:1566
0x1fb2b07 vect_analyze_loop_2
        /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3037
0x1fb4302 vect_analyze_loop_1
        /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3478
0x1fb4e9a vect_analyze_loop(loop*, gimple*, vec_info_shared*)
        /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3638
0x203c2dc try_vectorize_loop_1
        /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1095
0x203c839 try_vectorize_loop
        /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1212
0x203cb2c execute

zve32x cannot have a 64-bit ELEN, yet get_related_vectype_for_scalar_type
ends up with DImode as the vector_mode in loop_info.  The underlying
vect_analyze_xx code then asserts that the mode is a VECTOR mode and ICEs
at that assert.

The fix has two parts.  In the middle-end, make
get_related_vectype_for_scalar_type return NULL_TREE if the mode from
mode_for_vector is not a VECTOR mode.  In the RISC-V backend, mark RVV modes
whose inner mode is DImode as unsupported when TARGET_VECTOR_ELEN_64 is
false.

The following test suites passed for this patch:
* The rv64gcv full regression test.
* The x86 bootstrap test.
* The x86 full regression test.

PR target/116351

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_vector_mode_supported_any_target_p): Mark
RVV modes with a DImode inner mode as unsupported when zve32*.
* tree-vect-stmts.cc (get_related_vectype_for_scalar_type): Return
NULL_TREE if mode_for_vector is not a VECTOR mode.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr116351-1.c: New test.
* gcc.target/riscv/rvv/base/pr116351-2.c: New test.
* gcc.target/riscv/rvv/base/pr116351.h: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv.cc  |  6 +-
 .../gcc.target/riscv/rvv/base/pr116351-1.c |  5 +
 .../gcc.target/riscv/rvv/base/pr116351-2.c |  5 +
 .../gcc.target/riscv/rvv/base/pr116351.h   | 18 ++
 gcc/tree-vect-stmts.cc |  9 +++--
 5 files changed, 40 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr116351-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr116351-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr116351.h

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 9bf7713139f..89b534ac88f 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -12613,10 +12613,14 @@ extract_base_offset_in_addr (rtx mem, rtx *base, rtx *offset)
 /* Implements target hook vector_mode_supported_any_target_p.  */
 
 static bool
-riscv_vector_mode_supported_any_target_p (machine_mode)
+riscv_vector_mode_supported_any_target_p (machine_mode mode)
 {
   if (TARGET_XTHEADVECTOR)
     return false;
+
+  if (GET_MODE_INNER (mode) == DImode && !TARGET_VECTOR_ELEN_64)
+    return false;
+
   return true;
 }
 
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr116351-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr116351-1.c
new file mode 100644
index 000..f58fedfeaf1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr116351-1.c
@@ -0,0 +1,5 @@
+/* Test that we do not have ice when compile */
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zve32x -mabi=lp64d -O3 -ftree-vectorize" } */
+
+#include "pr116351.h"
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr116351-2.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr116351-2.c
new file mode 100644
index 000..e1f46b745e2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr116351-2.c
@@ -0,0 +1,5 @@
+/* Test that we do not have ice when compile */
+

Re: [to-be-committed][RISC-V][PR target/118248] Avoid bogus alloca call in RISC-V backend

2025-02-17 Thread Richard Biener
On Sun, Feb 16, 2025 at 4:38 PM Jeff Law  wrote:
>
> This is Jakub's patch and Ian's testcase for the slightly vexing fault
> building the D runtime with an s390x-x-riscv cross compiler.
>
> The core issue is we're allocating a vector to hold temporary registers
> unconditionally, including cases where the vector isn't needed because
> the loop isn't going to iterate.
>
> In the cases where the vector isn't needed the length is computed with
> an expression (x / y) - 1 where x / y will be zero.  The alloca(-1) on
> the s390 platform triggers a fault.  We haven't seen the fault with an
> x86 cross, but we can certainly see the bogus value being passed to
> alloca with a debugger.

I would expect alloca(-1) to trigger a fault with -fstack-clash-protection
even on x86, so we should indeed avoid doing this.
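
To make the failure mode concrete, here is a small self-contained sketch in
plain C++.  It is not the RISC-V backend code; temp_count, as_alloc_size and
guarded_temp_count are hypothetical names used only for illustration.  It
shows how a count of the form (x / y) - 1 becomes a huge size once x / y is
zero, and how guarding the computation avoids that:

```cpp
#include <cstddef>

// Hypothetical stand-in for the length computation described above:
// (x / y) - 1, evaluated in a signed type.
long temp_count (long x, long y)
{
  return (x / y) - 1;
}

// What an allocator taking a size_t would see for that count.
std::size_t as_alloc_size (long count)
{
  return static_cast<std::size_t> (count);
}

// Guarded variant: no temporaries are requested when the loop would
// not iterate at all.
std::size_t guarded_temp_count (long x, long y)
{
  long iterations = x / y;
  return iterations > 0 ? static_cast<std::size_t> (iterations - 1) : 0;
}
```

With x = 3 and y = 8 the unguarded count is -1, which an allocator sees as
the largest size_t value; the guarded variant asks for zero entries instead.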

>
> Jakub patch just conditionalizes the whole block in a sensible way.  So
> it looks larger than it really is.  I thought it might be better to do a
> bit of manual CSE on this code to make it even more obvious, but I think
> we're ultimately OK here.
>
> Ian provided the testcase, collapsed down into equivalent C code.
> Again, it doesn't fault on an x86-x-riscv, but I can see the incorrect
> behavior with a debugger.
>
> And a shout-out to Stefan for providing a docker based reproducer, it
> really helped track this down.
>
> Waiting for the pre-commit tester to do its thing before committing.
>
> Jeff
>


Re: [PATCH] libgomp: avoid unused-variable-error when configured with CFLAGS=-DNDEBUG

2025-02-17 Thread Thomas Schwinge
Hi!

On 2025-02-16T05:25:35+, "shynur ."  wrote:
> (The *new* patch is attached.)
>
> Hi, Jakub and Thomas~  I found some problems when compiling GCC, and it turns
> out it was related to libgomp.
>
>   $ git clone ...
>   $ mkdir gcc-build
>   $ cd gcc-build
>
> If I configure GCC with
>
>   $ CC=gcc-14 CXX=g++-14 CFLAGS=-DNDEBUG ../gcc/configure 
> --enable-languages=c++ --disable-multilib --disable-checking 
> --disable-bootstrap --program-suffix=-test
>
> Then
>
>   $ make -j2
>
> It will fail because
>
>   ../../../gcc/libgomp/oacc-mem.c: In function ‘acc_unmap_data’:
>   ../../../gcc/libgomp/oacc-mem.c:483:8: error: unused variable 
> ‘is_tgt_unmapped’ [-Werror=unused-variable]
> 483 |   bool is_tgt_unmapped = gomp_remove_var (acc_dev, n);
> |^~~

Ah, this mis-feature (in my opinion) of 'assert', that for '-DNDEBUG' it
doesn't evaluate its argument...

> This patch applies `__attribute__((unused))` to these variables, eliminating
> the need for additional `-Wno` flag thus retaining the static checking
> capabilities provided by GCC.

That's OK, but two things need changing, see below.

> Thanks for reviewing!

Thanks for improving GCC!  :-)

> From f23e1c52b34c403806f6a7c1b746a777c0fdf457 Mon Sep 17 00:00:00 2001
> From: shynur 
> Date: Sun, 16 Feb 2025 13:08:30 +0800
> Subject: [PATCH] Avoid unused-variable-error when configured with
>  'CFLAGS=-DNDEBUG'.

As part of the Git commit message, please include a ChangeLog update (see
 and 'git log').
Basically, 'contrib/gcc-changelog/git_check_commit.py --print-changelog'
needs to accept your commit.

> --- a/libgomp/target.c
> +++ b/libgomp/target.c
> @@ -2092,13 +2092,13 @@ gomp_unmap_vars_internal (struct target_mem_desc 
> *tgt, bool do_copyfrom,
>   tgt->list[i].length);
>if (do_remove)
>   {
> -   struct target_mem_desc *k_tgt = k->tgt;
> -   bool is_tgt_unmapped = gomp_remove_var (devicep, k);
> +   bool is_tgt_unmapped __attribute__((unused))
> + = gomp_remove_var (devicep, k);
> /* It would be bad if TGT got unmapped while we're still iterating
>over its LIST_COUNT, and also expect to use it in the following
>code.  */
> assert (!is_tgt_unmapped
> -   || k_tgt != tgt);
> +   || k->tgt != tgt);
>   }
>  }

Please check: if I remember correctly, it's no longer valid to
dereference 'k->tgt' after 'gomp_remove_var (devicep, k);'?  (That's why
we preserve the former as 'k_tgt'.)
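
A minimal sketch of that concern, assuming a much simplified model: desc,
key, remove_key and unmap_one below are hypothetical and are not libgomp's
types or functions.  Whatever the assert needs from the key is read before
the call that frees it, and the saved value carries the unused attribute so
that a -DNDEBUG build stays free of the unused-variable warning:

```cpp
#include <cassert>

// Hypothetical miniature of the pattern; not libgomp's real data structures.
struct desc { int refcount; };
struct key { desc *tgt; };

// Removes K and returns true if that also released K's descriptor.
// K must not be dereferenced after this call.
bool remove_key (key *k)
{
  desc *d = k->tgt;
  bool released = --d->refcount == 0;
  delete k;
  if (released)
    delete d;
  return released;
}

// Unmaps K while the caller is still working with CURRENT.
bool unmap_one (key *k, desc *current)
{
  // Read what the assert needs before K is freed.
  bool was_current __attribute__ ((unused)) = k->tgt == current;
  bool released = remove_key (k);
  // With -DNDEBUG this assert expands to nothing, so was_current would
  // otherwise be reported as unused.
  assert (!released || !was_current);
  return released;
}
```

The point of the design is that nothing after remove_key touches k, whether
or not the assert is compiled in.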


Grüße
 Thomas


Re: [PATCH v1] Vect: Fix ICE when get DImode from get_related_vectype_for_scalar_type [PR116351]

2025-02-17 Thread Richard Biener
On Mon, Feb 17, 2025 at 10:38 AM  wrote:
>
> From: Pan Li 
>
> This patch would like to fix the ICE similar as below, assump we have
> sample code:
>
>1   │ int a, b, c;
>2   │ short d, e, f;
>3   │ long g (long h) { return h; }
>4   │
>5   │ void i () {
>6   │   for (; b; ++b) {
>7   │ f = 5 >> a ? d : d << a;
>8   │ e &= c | g(f);
>9   │   }
>   10   │ }
>
> It will ice when compile with -O3 -march=rv64gc_zve64f -mrvv-vector-bits=zvl
>
> during GIMPLE pass: vect
> pr116351-1.c: In function ‘i’:
> pr116351-1.c:8:6: internal compiler error: in get_len_load_store_mode,
> at optabs-tree.cc:655
> 8 | void i () {
>   |  ^
> 0x44d6b9d internal_error(char const*, ...)
> /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/diagnostic-global-context.cc:517
> 0x44a26a6 fancy_abort(char const*, int, char const*)
> 
> /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/diagnostic.cc:1722
> 0x19e4309 get_len_load_store_mode(machine_mode, bool, internal_fn*,
> vec*)
> 
> /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/optabs-tree.cc:655
> 0x1fada40 vect_verify_loop_lens
> 
> /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:1566
> 0x1fb2b07 vect_analyze_loop_2
> /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3037
> 0x1fb4302 vect_analyze_loop_1
> 
> /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3478
> 0x1fb4e9a vect_analyze_loop(loop*, gimple*, vec_info_shared*)
> 
> /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3638
> 0x203c2dc try_vectorize_loop_1
> 
> /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1095
> 0x203c839 try_vectorize_loop
> 
> /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1212
> 0x203cb2c execute
>
> The zve32x cannot have 64 elen, and then the 
> get_related_vectype_for_scalar_type
> will get DImode as vector_mode in loop_info.  After that the underlying
> vect_analyze_xx will assert the mode is VECTOR and then ICE at the assert.
>
> The fix contains 2 part, aka let the get_related_vectype_for_scalar_type
> return NULL_TREE if mode_for_vector is not VECTOR mode in the middle-end,
> and then mark the innermode of RVV is DImode is not support when the
> TARGET_VECTOR_ELEN_64 is false.
>
> The below test suites are passed for this patch.
> * The rv64gcv fully regression test.
> * The x86 bootstrap test.
> * The x86 fully regression test.
>
> PR target/116351
>
> gcc/ChangeLog:
>
> * config/riscv/riscv.cc (riscv_vector_mode_supported_any_target_p): 
> Mark
> innnermode of RVV is DImode unsupported when zve32*.
> * tree-vect-stmts.cc (get_related_vectype_for_scalar_type): Return
> the NULL_TREE if mode_for_vector is not a VECTOR mode.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/pr116351-1.c: New test.
> * gcc.target/riscv/rvv/base/pr116351-2.c: New test.
> * gcc.target/riscv/rvv/base/pr116351.h: New test.
>
> Signed-off-by: Pan Li 
> ---
>  gcc/config/riscv/riscv.cc  |  6 +-
>  .../gcc.target/riscv/rvv/base/pr116351-1.c |  5 +
>  .../gcc.target/riscv/rvv/base/pr116351-2.c |  5 +
>  .../gcc.target/riscv/rvv/base/pr116351.h   | 18 ++
>  gcc/tree-vect-stmts.cc |  9 +++--
>  5 files changed, 40 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr116351-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr116351-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr116351.h
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 9bf7713139f..89b534ac88f 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -12613,10 +12613,14 @@ extract_base_offset_in_addr (rtx mem, rtx *base, 
> rtx *offset)
>  /* Implements target hook vector_mode_supported_any_target_p.  */
>
>  static bool
> -riscv_vector_mode_supported_any_target_p (machine_mode)
> +riscv_vector_mode_supported_any_target_p (machine_mode mode)
>  {
>if (TARGET_XTHEADVECTOR)
>  return false;
> +
> +  if (GET_MODE_INNER (mode) == DImode && !TARGET_VECTOR_ELEN_64)
> +return false;
> +
>return true;
>  }
>
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr116351-1.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/base/pr116351-1.c
> new file mode 100644
> index 000..f58fedfeaf1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr116351-1.c
> @@ -0,0 +1,5 @@
> +/* Test that we do not have ice when compile */
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gc_zve32x -mabi=lp64d -O3 -ftree-vectorize" } */
> +
> +#include "pr116351.h"
> diff --git a/gcc/testsuite/g

RE: [PATCH v1] Vect: Fix ICE when get DImode from get_related_vectype_for_scalar_type [PR116351]

2025-02-17 Thread Li, Pan2
> But that's wrong - read the comment before the code.  We do support
> integer mode "generic" vectorization just fine.  Iff there's anything to
> plug then it's how we end up thinking there's with_len support for DImode
> vectors.

I see.  Then this needs to be fixed somewhere else; let me give it a try.

Pan

-----Original Message-----
From: Richard Biener  
Sent: Monday, February 17, 2025 6:02 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; 
jeffreya...@gmail.com; rdapp@gmail.com
Subject: Re: [PATCH v1] Vect: Fix ICE when get DImode from 
get_related_vectype_for_scalar_type [PR116351]


Re: [PATCH] Simplify _Hashtable::_M_merge_multi

2025-02-17 Thread François Dumont


On 16/02/2025 23:14, Jonathan Wakely wrote:

> On Sun, 16 Feb 2025 at 21:15, François Dumont  wrote:
> >
> > Hi
> >
> > A minor simplification.
> >
> > libstdc++: Simplify _Hashtable::_M_merge_multi
> >
> > When merging two hashtable instances of the same type we do not need
> > to go through _M_src_hash_code that also check for identical Hash functor
> > type.
>
> But that check is very cheap, do we really gain much here?

No runtime gain, no; it just limits instantiations.

> If we're getting rid of _M_src_hash_code in this overload of
> _M_merge_multi, should we get rid of it in the other overload too,
> since that's the only other place that uses _M_src_hash_code?

That seems complicated, so maybe we can avoid the is_same_v and is_empty_v
instantiations with this smaller patch?

    libstdc++: [_Hashtable] Avoid is_same_v and is_empty_v instantiations

    There is no need to check for the same Hash functor type, or for that
    type being stateless, if there is no cached hash code to reuse.

    libstdc++-v3/ChangeLog:

    * include/bits/hashtable.h (_Hashtable<>::_M_src_hash_code):
    Add __hash_cached::value check.

Let me know if interested.

François
diff --git a/libstdc++-v3/include/bits/hashtable.h b/libstdc++-v3/include/bits/hashtable.h
index d6d76a743bb..b17a314a11d 100644
--- a/libstdc++-v3/include/bits/hashtable.h
+++ b/libstdc++-v3/include/bits/hashtable.h
@@ -1213,6 +1213,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
_M_src_hash_code(const _H2&, const key_type& __k,
 const __node_value_type& __src_n) const
{
+ if constexpr (__hash_cached::value)
if constexpr (std::is_same_v<_H2, _Hash>)
  if constexpr (std::is_empty_v<_Hash>)
// If the node has a cached hash code, it's OK to use it.


Re: [PATCH v1] RISC-V: Make VXRM as global register [PR118103]

2025-02-17 Thread Richard Sandiford
Richard Sandiford  writes:
> The problem seems to be that mode-switching overloads VXRM_MODE_NONE
> to mean both "no requirement" and "unknown state".  So we have:
>
> static int
> singleton_vxrm_need (void)
> {
>   /* Only needed for vector code.  */
>   if (!TARGET_VECTOR)
> return VXRM_MODE_NONE;

This was a bad example, sorry.  What matters more is that non-vector
instructions are also VXRM_MODE_NONE.  Or more specifically:

>
> and:
>
>   if (vxrm_unknown_p (insn))
> return VXRM_MODE_NONE;
>
> This means that VXRM is assumed to be transparent in an instruction
> that matches vxrm_unknown_p.

...the function:

static int
riscv_vxrm_mode_after (rtx_insn *insn, int mode)
{
  if (vxrm_unknown_p (insn))
    return VXRM_MODE_NONE;

  if (recog_memoized (insn) < 0)
    return mode;

  if (reg_mentioned_p (gen_rtx_REG (SImode, VXRM_REGNUM), PATTERN (insn)))
    return get_attr_vxrm_mode (insn);
  else
    return mode;
}

will return VXRM_MODE_NONE if:

(a) insn is something like a call
(b) insn is a normal instruction that does not mention VXRM at all and
mode is already VXRM_MODE_NONE

(b) is the transparent case but (a) is a kill.  Since the block walk
starts with VXRM_MODE_NONE as the initial mode, there needs to be
another mode that (a) can use to indicate a kill.
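
As a rough illustration only, here is a toy model of that distinction.  The
enums and functions are hypothetical and are not the RISC-V backend's, and
MODE_UNKNOWN is an assumed name for the extra mode.  With a separate value
for the kill, a block containing a call no longer looks the same at exit as
a block that never touches the state:

```cpp
// Toy model; none of these names exist in the RISC-V backend.
enum toy_mode
{
  MODE_RNU,      // a concrete rounding mode
  MODE_RDN,      // another concrete rounding mode
  MODE_NONE,     // "no requirement": the instruction is transparent
  MODE_UNKNOWN   // assumed extra mode: the state was clobbered (a kill)
};

enum toy_insn_kind { INSN_PLAIN, INSN_SETS_RNU, INSN_SETS_RDN, INSN_CALL };

// Mode in effect after INSN, given the mode before it.
toy_mode mode_after (toy_insn_kind insn, toy_mode before)
{
  switch (insn)
    {
    case INSN_CALL:     return MODE_UNKNOWN;  // case (a): kill
    case INSN_SETS_RNU: return MODE_RNU;
    case INSN_SETS_RDN: return MODE_RDN;
    case INSN_PLAIN:    return before;        // case (b): transparent
    }
  return before;
}

// Walk a block starting from MODE_NONE, as the block walk does.
toy_mode mode_at_exit (const toy_insn_kind *insns, int n)
{
  toy_mode mode = MODE_NONE;
  for (int i = 0; i < n; i++)
    mode = mode_after (insns[i], mode);
  return mode;
}
```

In this model a block of plain instructions exits with MODE_NONE, while the
same block with a call in it exits with MODE_UNKNOWN, so the two cases stay
distinguishable.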

Thanks,
Richard

