[PATCH] c++/modules: Don't emit imported deduction guides [PR117397]

2025-01-11 Thread Nathaniel Shead
Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?

-- >8 --

The ICE in the linked PR is caused because name lookup finds duplicate
copies of the deduction guides, causing a checking assert to fail.

This is ultimately because we're exporting an imported guide; when name
lookup processes 'dguide-5_b.H' it goes via the 'tt_entity' path and
just returns the entity from 'dguide-5_a.H'.  Because this doesn't ever
go through 'key_mergeable' we never set 'BINDING_VECTOR_GLOBAL_DUPS_P'
and so deduping is not engaged, allowing duplicate results.

Currently I believe this to be a perculiarity of the ANY_REACHABLE
handling for deduction guides; in no other case that I can find do we
emit bindings purely to imported entities.  As such, this patch fixes
this problem from that end, by ensuring that we simply do not emit any
imported deduction guides.  This avoids the ICE because no duplicates
need deduping to start with, and should otherwise have no functional
change because lookup of deduction guides will look at all reachable
modules (exported or not) regardless.

Since we're now deliberately not emitting imported deduction guides we
can use LOOK_want::NORMAL instead of LOOK_want::ANY_REACHABLE, since the
extra work to find as-yet undiscovered deduction guides in transitive
importers is not necessary here.

PR c++/117397

gcc/cp/ChangeLog:

* module.cc (depset::hash::add_deduction_guides): Don't emit
imported deduction guides.

gcc/testsuite/ChangeLog:

* g++.dg/modules/dguide-5_a.H: New test.
* g++.dg/modules/dguide-5_b.H: New test.
* g++.dg/modules/dguide-5_c.H: New test.

Signed-off-by: Nathaniel Shead 
---
 gcc/cp/module.cc  | 30 +++
 gcc/testsuite/g++.dg/modules/dguide-5_a.H |  6 +
 gcc/testsuite/g++.dg/modules/dguide-5_b.H |  6 +
 gcc/testsuite/g++.dg/modules/dguide-5_c.H |  7 ++
 4 files changed, 39 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/modules/dguide-5_a.H
 create mode 100644 gcc/testsuite/g++.dg/modules/dguide-5_b.H
 create mode 100644 gcc/testsuite/g++.dg/modules/dguide-5_c.H

diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index 51990f36284..c4df18b9ca9 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -14341,22 +14341,32 @@ depset::hash::add_deduction_guides (tree decl)
   if (find_binding (ns, name))
 return;
 
-  tree guides = lookup_qualified_name (ns, name, LOOK_want::ANY_REACHABLE,
+  tree guides = lookup_qualified_name (ns, name, LOOK_want::NORMAL,
   /*complain=*/false);
   if (guides == error_mark_node)
 return;
 
-  /* We have bindings to add.  */
-  depset *binding = make_binding (ns, name);
-  add_namespace_context (binding, ns);
+  depset *binding = nullptr;
+  for (tree t : lkp_range (guides))
+{
+  /* We don't want to export imported deduction guides, since searches will
+look there anyway.  */
+  if (DECL_LANG_SPECIFIC (STRIP_TEMPLATE (t))
+ && DECL_MODULE_IMPORT_P (STRIP_TEMPLATE (t)))
+   continue;
 
-  depset **slot = binding_slot (ns, name, /*insert=*/true);
-  *slot = binding;
+  if (!binding)
+   {
+ /* We have bindings to add.  */
+ binding = make_binding (ns, name);
+ add_namespace_context (binding, ns);
 
-  for (lkp_iterator it (guides); it; ++it)
-{
-  gcc_checking_assert (!TREE_VISITED (*it));
-  depset *dep = make_dependency (*it, EK_FOR_BINDING);
+ depset **slot = binding_slot (ns, name, /*insert=*/true);
+ *slot = binding;
+   }
+
+  gcc_checking_assert (!TREE_VISITED (t));
+  depset *dep = make_dependency (t, EK_FOR_BINDING);
   binding->deps.safe_push (dep);
   dep->deps.safe_push (binding);
 }
diff --git a/gcc/testsuite/g++.dg/modules/dguide-5_a.H 
b/gcc/testsuite/g++.dg/modules/dguide-5_a.H
new file mode 100644
index 000..42421059c7f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/dguide-5_a.H
@@ -0,0 +1,6 @@
+// PR c++/117397
+// { dg-additional-options "-fmodule-header" }
+// { dg-module-cmi {} }
+
+template  struct S;
+template  S(T) -> S;
diff --git a/gcc/testsuite/g++.dg/modules/dguide-5_b.H 
b/gcc/testsuite/g++.dg/modules/dguide-5_b.H
new file mode 100644
index 000..d31f24d54de
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/dguide-5_b.H
@@ -0,0 +1,6 @@
+// PR c++/117397
+// { dg-additional-options "-fmodule-header" }
+// { dg-module-cmi {} }
+
+import "dguide-5_a.H";
+template  struct S;
diff --git a/gcc/testsuite/g++.dg/modules/dguide-5_c.H 
b/gcc/testsuite/g++.dg/modules/dguide-5_c.H
new file mode 100644
index 000..a9bf2dd0583
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/dguide-5_c.H
@@ -0,0 +1,7 @@
+// PR c++/117397
+// { dg-additional-options "-fmodule-header" }
+// { dg-module-cmi {} }
+
+template  struct S;
+import "dguide-5_b.H";
+S foo();
-- 
2.47.0



Re: [PATCH] Fortran: Added support for locality specs in DO CONCURRENT (Fortran 2018/23)

2025-01-11 Thread Jerry D

On 1/7/25 12:06 PM, Jerry D wrote:

On 9/25/24 3:18 AM, Andre Vehreschild wrote:

Hi all,

I finally managed to apply the fixed patch. It still had some stray 
line break
so check_GNU_style.py wouldn't succeed. But with that fixed I agree to 
have

only some nonsense bickering of the script.

As to the patch (I have stripped large parts.):


diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 36ed8eeac2d..c6aefb81a73 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -3042,6 +3042,16 @@ enum gfc_exec_op
    EXEC_OMP_ERROR, EXEC_OMP_ALLOCATE, EXEC_OMP_ALLOCATORS
  };

+/* Enum Definition for locality types.  */
+enum locality_type
+{
+  LOCALITY_LOCAL = 0,
+  LOCALITY_LOCAL_INIT,
+  LOCALITY_SHARED,
+  LOCALITY_REDUCE,
+  LOCALITY_NUM
+};
+
  typedef struct gfc_code
  {
    gfc_exec_op op;
@@ -3089,7 +3099,15 @@ typedef struct gfc_code
  gfc_inquire *inquire;
  gfc_wait *wait;
  gfc_dt *dt;
-    gfc_forall_iterator *forall_iterator;
+
+    struct
+    {
+  gfc_forall_iterator *forall_iterator;
+  gfc_expr_list *locality[LOCALITY_NUM];
+  bool default_none;
+    }
+    concur;


I am more than unhappy about that construct. Because every concurrent 
loop has

a forall_iterator, but not every forall_iterator is a concurrent loop. I
therefore propose to move the forall_iterator out of the struct and 
only have
the concurrent specific elements in the struct. This would also reduce 
the

changes significantly.



Interestingly, simply moving the gfc_forall_iterator back to where it 
was before and changing all references to it to point to it I get a 
clean build of gfortran, but several of the testcases now fail with a 
segfault.


For example:

$ gfc -fcoarray=single do_concurrent_constraints.f90
f951: internal compiler error: Segmentation fault
0x22b9041 internal_error(char const*, ...)
 ../../trunk/gcc/diagnostic-global-context.cc:517
0xde4d6f crash_signal
 ../../trunk/gcc/toplev.cc:322
0x7053db parse_do_block
 ../../trunk/gcc/fortran/parse.cc:5414
0x7033c4 parse_executable
 ../../trunk/gcc/fortran/parse.cc:6396
0x7047ae parse_progunit
 ../../trunk/gcc/fortran/parse.cc:6803
0x704b58 parse_contained
 ../../trunk/gcc/fortran/parse.cc:6678
0x705b5c parse_module
 ../../trunk/gcc/fortran/parse.cc:7049
0x705f8c gfc_parse_file()
 ../../trunk/gcc/fortran/parse.cc:7351
0x75f69f gfc_be_parse_file
 ../../trunk/gcc/fortran/f95-lang.cc:241

In gdb it looks like the 'next' field in the iterator is pointing to 
garbage when it ought to be NULL.  I am looking around to see why that 
is not getting initialized correctly or maybe this has uncovered 
something more nasty.


Jerry



The attached patch is the latest clean build and test run I can come up 
with. I completely cannot understand why moving the forall_iterator from 
the sub-structure 'concur' back to where it was at the 'ext' 
sub-structure of typedef struct gfc_code. 'ext' is a union. I suspected 
there is an overlap going on there such that something is getting 
overwritten or optimized away. I am unable to find the culprit.


Regression tested on x86_64. OK for trunk?

I will make sure the Changelog stuff is squared away. I also think I 
will open a PR regarding the problem I described above.


Regards,

Jerrydiff --git a/gcc/fortran/dump-parse-tree.cc b/gcc/fortran/dump-parse-tree.cc
index 8d31ddfcffb..e97693d54d9 100644
--- a/gcc/fortran/dump-parse-tree.cc
+++ b/gcc/fortran/dump-parse-tree.cc
@@ -2899,7 +2899,7 @@ show_code_node (int level, gfc_code *c)
 
 case EXEC_FORALL:
   fputs ("FORALL ", dumpfile);
-  for (fa = c->ext.forall_iterator; fa; fa = fa->next)
+  for (fa = c->ext.concur.forall_iterator; fa; fa = fa->next)
 	{
 	  show_expr (fa->var);
 	  fputc (' ', dumpfile);
@@ -2959,7 +2959,7 @@ show_code_node (int level, gfc_code *c)
 
 case EXEC_DO_CONCURRENT:
   fputs ("DO CONCURRENT ", dumpfile);
-  for (fa = c->ext.forall_iterator; fa; fa = fa->next)
+  for (fa = c->ext.concur.forall_iterator; fa; fa = fa->next)
 {
   show_expr (fa->var);
   fputc (' ', dumpfile);
@@ -2972,7 +2972,114 @@ show_code_node (int level, gfc_code *c)
   if (fa->next != NULL)
 fputc (',', dumpfile);
 }
-  show_expr (c->expr1);
+
+  if (c->expr1 != NULL)
+	{
+	  fputc (',', dumpfile);
+	  show_expr (c->expr1);
+	}
+
+  if (c->ext.concur.locality[LOCALITY_LOCAL])
+	{
+	  fputs (" LOCAL (", dumpfile);
+
+	  for (gfc_expr_list *el = c->ext.concur.locality[LOCALITY_LOCAL];
+	   el; el = el->next)
+	{
+	  show_expr (el->expr);
+	  if (el->next)
+		fputc (',', dumpfile);
+	}
+	  fputc (')', dumpfile);
+	}
+
+  if (c->ext.concur.locality[LOCALITY_LOCAL_INIT])
+	{
+	  fputs (" LOCAL_INIT (", dumpfile);
+	  for (gfc_expr_list *el = c->ext.concur.locality[LOCALITY_LOCAL_INIT];
+	   el; el = el->next)
+	  {
+	show_expr (el->expr);
+	if (el->next)
+	  fputc (',', dump

[committed] testsuite: Fix flag used for modules test

2025-01-11 Thread Nathaniel Shead
Tested on x86_64-pc-linux-gnu, committing as obvious.

-- >8 --

GCC14 doesn't have the new spelling '-fmodules' for enabling modules;
use the old '-fmodules-ts' spelling instead.

gcc/testsuite/ChangeLog:

* g++.dg/modules/pr114630_a.C: Use -fmodules-ts instead of
-fmodules in testcase.
* g++.dg/modules/pr114630_b.C: Likewise.
* g++.dg/modules/pr114630_c.C: Likewise.

Signed-off-by: Nathaniel Shead 
---
 gcc/testsuite/g++.dg/modules/pr114630_a.C | 2 +-
 gcc/testsuite/g++.dg/modules/pr114630_b.C | 2 +-
 gcc/testsuite/g++.dg/modules/pr114630_c.C | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/g++.dg/modules/pr114630_a.C 
b/gcc/testsuite/g++.dg/modules/pr114630_a.C
index ecfd7ca0b28..dde8388e743 100644
--- a/gcc/testsuite/g++.dg/modules/pr114630_a.C
+++ b/gcc/testsuite/g++.dg/modules/pr114630_a.C
@@ -1,4 +1,4 @@
-// { dg-additional-options "-fmodules" }
+// { dg-additional-options "-fmodules-ts" }
 // { dg-module-cmi X }
 
 module;
diff --git a/gcc/testsuite/g++.dg/modules/pr114630_b.C 
b/gcc/testsuite/g++.dg/modules/pr114630_b.C
index 52fe04e2ce0..e41ddcdfb8e 100644
--- a/gcc/testsuite/g++.dg/modules/pr114630_b.C
+++ b/gcc/testsuite/g++.dg/modules/pr114630_b.C
@@ -1,4 +1,4 @@
-// { dg-additional-options "-fmodules" }
+// { dg-additional-options "-fmodules-ts" }
 // { dg-module-cmi Y }
 
 module;
diff --git a/gcc/testsuite/g++.dg/modules/pr114630_c.C 
b/gcc/testsuite/g++.dg/modules/pr114630_c.C
index 54a21f08057..61d89829588 100644
--- a/gcc/testsuite/g++.dg/modules/pr114630_c.C
+++ b/gcc/testsuite/g++.dg/modules/pr114630_c.C
@@ -1,4 +1,4 @@
-// { dg-additional-options "-fmodules" }
+// { dg-additional-options "-fmodules-ts" }
 
 #include "pr114630.h"
 import Y;
-- 
2.47.0



[PATCH,LRA] Restrict the reuse of spill slots [PR117868]

2025-01-11 Thread Denis Chertykov

The fix for PR117868.

In brief:
this is an LRA bug derived from reuse spilling slots after frame pointer 
spilling.
The slot was created for QImode (1 byte) and it was reused after spilling of the
frame pointer for TImode register (16 bytes long) and it overlaps other slots.

Wrong things happened while `lra_spill ()'
 part of lra-spills.cc 
  n = assign_spill_hard_regs (pseudo_regnos, n);
  slots_num = 0;
  assign_stack_slot_num_and_sort_pseudos (pseudo_regnos, n);  <--- first call 
---
  for (i = 0; i < n; i++)
if (pseudo_slots[pseudo_regnos[i]].mem == NULL_RTX)
  assign_mem_slot (pseudo_regnos[i]);
  if ((n2 = lra_update_fp2sp_elimination (pseudo_regnos)) > 0)
{
  /* Assign stack slots to spilled pseudos assigned to fp.  */
  assign_stack_slot_num_and_sort_pseudos (pseudo_regnos, n2);  <--- second 
call ---
  for (i = 0; i < n2; i++)
if (pseudo_slots[pseudo_regnos[i]].mem == NULL_RTX)
  assign_mem_slot (pseudo_regnos[i]);
}
--

In a first call of `assign_stack_slot_num_and_sort_pseudos(...)' LRA allocates 
slot #17
for r93 (QImode - 1 byte).
In a second call of `assign_stack_slot_num_and_sort_pseudos(...)' LRA reuse 
slot #17 for
r114 (TImode - 16 bytes).
It's wrong. We can't reuse 1 byte slot #17 for 16 bytes register.


Details:

After IRA pass we have:
-- part of simd-t.c.319r.ira --
(insn 269 268 270 8 (set (subreg:QI (reg:TI 114 [ _116 ]) 6)
(xor:QI (reg:QI 66 [ _36 ])
(reg:QI 67 [ _37 ]))) 631 {xorqi3}
 (expr_list:REG_DEAD (reg:QI 67 [ _37 ])
(expr_list:REG_DEAD (reg:QI 66 [ _36 ])
(nil
(insn 270 269 271 8 (set (subreg:QI (reg:TI 114 [ _116 ]) 7)
(xor:QI (reg:QI 69 [ _39 ])
(reg:QI 70 [ _40 ]))) 631 {xorqi3}
 (expr_list:REG_DEAD (reg:QI 70 [ _40 ])
(expr_list:REG_DEAD (reg:QI 69 [ _39 ])
(nil
---

While LRA spilling:
-- part of simd-t.c.320r.reload ---
  Creating newreg=348 from oldreg=66, assigning class GENERAL_REGS to r348
  269: r348:QI=r348:QI^r67:QI
  REG_DEAD r67:QI
  REG_DEAD r66:QI
Inserting insn reload before:
  543: r348:QI=r66:QI
Inserting insn reload after:
  544: r114:TI#6=r348:QI

[...]

  Choosing alt 0 in insn 270:  (0) =r  (1) %0  (2) r {xorqi3}
  Creating newreg=349 from oldreg=69, assigning class GENERAL_REGS to r349
  Creating newreg=350 from oldreg=70, assigning class GENERAL_REGS to r350
  270: r349:QI=r349:QI^r350:QI
  REG_DEAD r70:QI
  REG_DEAD r69:QI
Inserting insn reload before:
  545: r349:QI=r69:QI
  547: r350:QI=r70:QI
Inserting insn reload after:
  546: r114:TI#7=r349:QI
---


After LRA pass:
-- part of simd-t.c.320r.reload ---
(insn 543 542 269 11 (set (reg:QI 14 r14 [orig:66 _36 ] [66])
(mem/c:QI (plus:HI (reg/f:HI 28 r28)
(const_int 7 [0x7])) [4 %sfp+7 S1 A8])) 113 {movqi_insn_split}
 (nil))
(insn 269 543 544 11 (set (reg:QI 14 r14 [orig:66 _36 ] [66])
(xor:QI (reg:QI 14 r14 [orig:66 _36 ] [66])
(reg:QI 2 r2 [orig:67 _37 ] [67]))) 631 {xorqi3}
 (nil))
(insn 544 269 545 11 (set (mem/c:QI (plus:HI (reg/f:HI 28 r28)
(const_int 24 [0x18])) [4 %sfp+24 S1 A8])
(reg:QI 14 r14 [orig:66 _36 ] [66])) 113 {movqi_insn_split}
 (nil))
(insn 545 544 547 11 (set (reg:QI 14 r14 [orig:69 _39 ] [69])
(mem/c:QI (plus:HI (reg/f:HI 28 r28)
(const_int 24 [0x18])) [4 %sfp+24 S1 A8])) 113 
{movqi_insn_split}
 (nil))
(insn 547 545 270 11 (set (reg:QI 13 r13 [orig:70 _40 ] [70])
(mem/c:QI (plus:HI (reg/f:HI 28 r28)
(const_int 8 [0x8])) [4 %sfp+8 S1 A8])) 113 {movqi_insn_split}
 (nil))
(insn 270 547 546 11 (set (reg:QI 14 r14 [orig:69 _39 ] [69])
(xor:QI (reg:QI 14 r14 [orig:69 _39 ] [69])
(reg:QI 13 r13 [orig:70 _40 ] [70]))) 631 {xorqi3}
 (nil))
(insn 546 270 548 11 (set (mem/c:QI (plus:HI (reg/f:HI 28 r28)
(const_int 25 [0x19])) [4 %sfp+25 S1 A8])
(reg:QI 14 r14 [orig:69 _39 ] [69])) 113 {movqi_insn_split}
 (nil))
---

Simplified insns before LRA:
--
insn 269 r114.6  =  r66 ^ r67
insn 270 r114.7  =  r69 ^ r70
--

after LRA:
--
insn 543 r14 {r66} = [sfp+7]# reload insn
insn 269 r14 {r66} ^= r2 {r67}
insn 544 [sfp+24] = r14 {r66}   # reload insn -- bug is here !

insn 545 r14 {r69} = [sfp+24]   # reload insn
insn 547 r13 {r70} = [sfp+8]# reload insn
insn 270 r14 {r69} ^= r13 {r70}
insn 546 [sfp+25] = r

[PATCH] testsuite: The expect framework might introduce CR in output

2025-01-11 Thread Torbjörn SVENSSON
Ok for trunk and releases/gcc-14?

--

When running tests using the "sim" config, the command is launched in
non-readonly mode and the text retrieved from the expect command will
then replace all LF with CRLF. (The problem can be found in sim_load
where it calls remote_spawn without an input file).

libstdc++-v3/ChangeLog:

* 27_io/print/1.cc: Allow both LF and CRLF in test.

Signed-off-by: Torbjörn SVENSSON 
---
 libstdc++-v3/testsuite/27_io/print/1.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libstdc++-v3/testsuite/27_io/print/1.cc 
b/libstdc++-v3/testsuite/27_io/print/1.cc
index f6585d9880a..2a74e5002f4 100644
--- a/libstdc++-v3/testsuite/27_io/print/1.cc
+++ b/libstdc++-v3/testsuite/27_io/print/1.cc
@@ -18,7 +18,7 @@ void
 test_println_default()
 {
   std::println("I walk the line");
-  // { dg-output "I walk the line\n" }
+  // { dg-output "I walk the line\r?\n" }
 }
 
 void
-- 
2.25.1



[PATCH] c: improve UX for -Wincompatible-pointer-types and C23 [PR116871]

2025-01-11 Thread David Malcolm
PR c/116871 notes that our diagnostics about incompatible function types
could be improved.

In particular, for the case of migrating to C23 I'm seeing a lot of
build failures with signal handlers similar to this (simplified from
alsa-tools-1.2.11, envy24control/profiles.c; see rhbz#2336278):

typedef void (*__sighandler_t) (int);

extern __sighandler_t signal (int __sig, __sighandler_t __handler)
 __attribute__ ((__nothrow__ , __leaf__));

void new_process(void)
{
  void (*int_stat)();

  int_stat = signal(2,  ((__sighandler_t) 1));

  signal(2, int_stat);
}

With trunk, cc1 fails with this message:

t.c: In function 'new_process':
t.c:18:12: error: assignment to 'void (*)(void)' from incompatible pointer type 
'__sighandler_t' {aka 'void (*)(int)'} [-Wincompatible-pointer-types]
   18 |   int_stat = signal(2,  ((__sighandler_t) 1));
  |^
t.c:20:13: error: passing argument 2 of 'signal' from incompatible pointer type 
[-Wincompatible-pointer-types]
   20 |   signal(2, int_stat);
  | ^~~~
  | |
  | void (*)(void)
t.c:11:57: note: expected '__sighandler_t' {aka 'void (*)(int)'} but argument 
is of type 'void (*)(void)'
   11 | extern __sighandler_t signal (int __sig, __sighandler_t __handler)
  |  ~~~^

With this patch, cc1 emits:

t.c: In function 'new_process':
t.c:18:12: error: assignment to 'void (*)(void)' from incompatible pointer type 
'__sighandler_t' {aka 'void (*)(int)'} [-Wincompatible-pointer-types]
   18 |   int_stat = signal(2,  ((__sighandler_t) 1));
  |^
t.c:9:16: note: '__sighandler_t' declared here
9 | typedef void (*__sighandler_t) (int);
  |^~
cc1: note: the meaning of '()' in function declarations changed in C23 from 
'(int)' to '(void)'
t.c:20:13: error: passing argument 2 of 'signal' from incompatible pointer type 
[-Wincompatible-pointer-types]
   20 |   signal(2, int_stat);
  | ^~~~
  | |
  | void (*)(void)
t.c:11:57: note: expected '__sighandler_t' {aka 'void (*)(int)'} but argument 
is of type 'void (*)(void)'
   11 | extern __sighandler_t signal (int __sig, __sighandler_t __handler)
  |  ~~~^
t.c:9:16: note: '__sighandler_t' declared here
9 | typedef void (*__sighandler_t) (int);
  |^~

showing the location of the pertinent typedef ("__sighandler_t"), and
emitting the note
  "the meaning of '()' in function declarations changed in C23
  from '(int)' to '(void)'"

Another example, simplfied from a52dec-0.7.4: src/a52dec.c
(rhbz#2336013):

typedef void (*__sighandler_t) (int);

extern __sighandler_t signal (int __sig, __sighandler_t __handler)
 __attribute__ ((__nothrow__ , __leaf__));

/* Mismatching return type.  */
#define RETSIGTYPE int
static RETSIGTYPE signal_handler (int sig)
{
}

static void print_fps (int final)
{
  signal (42, signal_handler);
}

With trunk, cc1 emits:

t2.c: In function 'print_fps':
t2.c:22:15: error: passing argument 2 of 'signal' from incompatible pointer 
type [-Wincompatible-pointer-types]
   22 |   signal (42, signal_handler);
  |   ^~
  |   |
  |   int (*)(int)
t2.c:11:57: note: expected '__sighandler_t' {aka 'void (*)(int)'} but argument 
is of type 'int (*)(int)'
   11 | extern __sighandler_t signal (int __sig, __sighandler_t __handler)
  |  ~~~^

With this patch cc1 emits:

t2.c: In function 'print_fps':
t2.c:22:15: error: passing argument 2 of 'signal' from incompatible pointer 
type [-Wincompatible-pointer-types]
   22 |   signal (42, signal_handler);
  |   ^~
  |   |
  |   int (*)(int)
t2.c:11:57: note: expected '__sighandler_t' {aka 'void (*)(int)'} but argument 
is of type 'int (*)(int)'
   11 | extern __sighandler_t signal (int __sig, __sighandler_t __handler)
  |  ~~~^
t2.c:16:19: note: 'signal_handler' declared here
   16 | static RETSIGTYPE signal_handler (int sig)
  |   ^~
t2.c:9:16: note: '__sighandler_t' declared here
9 | typedef void (*__sighandler_t) (int);
  |^~

showing the location of the pertinent fndecl ("signal_handler"), and,
as before, the pertinent typedef.

The patch also updates the colorization in the messages to visually
link and contrast the different types and typedefs; I'm going to
upload screenshots to the bug.

My hope is that this make it easier for users to decipher build failures
seen with the new C23 default.

Further improvements could be made to colorization in
convert_for_assignment, and similar improvements to C++, but I'm punting
those to GCC 16.

Successful

Re: [PATCH] c: improve UX for -Wincompatible-pointer-types and C23 [PR116871]

2025-01-11 Thread David Malcolm
On Sat, 2025-01-11 at 13:55 -0500, David Malcolm wrote:
> PR c/116871 notes that our diagnostics about incompatible function
> types
> could be improved.
> 
> In particular, for the case of migrating to C23 I'm seeing a lot of
> build failures with signal handlers similar to this (simplified from
> alsa-tools-1.2.11, envy24control/profiles.c; see rhbz#2336278):
> 
> typedef void (*__sighandler_t) (int);
> 
> extern __sighandler_t signal (int __sig, __sighandler_t __handler)
>  __attribute__ ((__nothrow__ , __leaf__));
> 
> void new_process(void)
> {
>   void (*int_stat)();
> 
>   int_stat = signal(2,  ((__sighandler_t) 1));
> 
>   signal(2, int_stat);
> }
> 
> With trunk, cc1 fails with this message:
> 
> t.c: In function 'new_process':
> t.c:18:12: error: assignment to 'void (*)(void)' from incompatible
> pointer type '__sighandler_t' {aka 'void (*)(int)'} [-Wincompatible-
> pointer-types]
>    18 |   int_stat = signal(2,  ((__sighandler_t) 1));
>   |    ^
> t.c:20:13: error: passing argument 2 of 'signal' from incompatible
> pointer type [-Wincompatible-pointer-types]
>    20 |   signal(2, int_stat);
>   | ^~~~
>   | |
>   | void (*)(void)
> t.c:11:57: note: expected '__sighandler_t' {aka 'void (*)(int)'} but
> argument is of type 'void (*)(void)'
>    11 | extern __sighandler_t signal (int __sig, __sighandler_t
> __handler)
>   | 
> ~~~^
> 
> With this patch, cc1 emits:
> 
> t.c: In function 'new_process':
> t.c:18:12: error: assignment to 'void (*)(void)' from incompatible
> pointer type '__sighandler_t' {aka 'void (*)(int)'} [-Wincompatible-
> pointer-types]
>    18 |   int_stat = signal(2,  ((__sighandler_t) 1));
>   |    ^
> t.c:9:16: note: '__sighandler_t' declared here
>     9 | typedef void (*__sighandler_t) (int);
>   |    ^~
> cc1: note: the meaning of '()' in function declarations changed in
> C23 from '(int)' to '(void)'
> t.c:20:13: error: passing argument 2 of 'signal' from incompatible
> pointer type [-Wincompatible-pointer-types]
>    20 |   signal(2, int_stat);
>   | ^~~~
>   | |
>   | void (*)(void)
> t.c:11:57: note: expected '__sighandler_t' {aka 'void (*)(int)'} but
> argument is of type 'void (*)(void)'
>    11 | extern __sighandler_t signal (int __sig, __sighandler_t
> __handler)
>   | 
> ~~~^
> t.c:9:16: note: '__sighandler_t' declared here
>     9 | typedef void (*__sighandler_t) (int);
>   |    ^~
> 
> showing the location of the pertinent typedef ("__sighandler_t"), and
> emitting the note
>   "the meaning of '()' in function declarations changed in C23
>   from '(int)' to '(void)'"
> 
> Another example, simplfied from a52dec-0.7.4: src/a52dec.c
> (rhbz#2336013):
> 
> typedef void (*__sighandler_t) (int);
> 
> extern __sighandler_t signal (int __sig, __sighandler_t __handler)
>  __attribute__ ((__nothrow__ , __leaf__));
> 
> /* Mismatching return type.  */
> #define RETSIGTYPE int
> static RETSIGTYPE signal_handler (int sig)
> {
> }
> 
> static void print_fps (int final)
> {
>   signal (42, signal_handler);
> }
> 
> With trunk, cc1 emits:
> 
> t2.c: In function 'print_fps':
> t2.c:22:15: error: passing argument 2 of 'signal' from incompatible
> pointer type [-Wincompatible-pointer-types]
>    22 |   signal (42, signal_handler);
>   |   ^~
>   |   |
>   |   int (*)(int)
> t2.c:11:57: note: expected '__sighandler_t' {aka 'void (*)(int)'} but
> argument is of type 'int (*)(int)'
>    11 | extern __sighandler_t signal (int __sig, __sighandler_t
> __handler)
>   | 
> ~~~^
> 
> With this patch cc1 emits:
> 
> t2.c: In function 'print_fps':
> t2.c:22:15: error: passing argument 2 of 'signal' from incompatible
> pointer type [-Wincompatible-pointer-types]
>    22 |   signal (42, signal_handler);
>   |   ^~
>   |   |
>   |   int (*)(int)
> t2.c:11:57: note: expected '__sighandler_t' {aka 'void (*)(int)'} but
> argument is of type 'int (*)(int)'
>    11 | extern __sighandler_t signal (int __sig, __sighandler_t
> __handler)
>   | 
> ~~~^
> t2.c:16:19: note: 'signal_handler' declared here
>    16 | static RETSIGTYPE signal_handler (int sig)
>   |   ^~
> t2.c:9:16: note: '__sighandler_t' declared here
>     9 | typedef void (*__sighandler_t) (int);
>   |    ^~
> 
> showing the location of the pertinent fndecl ("signal_handler"), and,
> as before, the pertinent typedef.
> 
> The patch also updates the colorization in the messages to visually
> link and contrast the different 

Re: [PATCH] testsuite: The expect framework might introduce CR in output

2025-01-11 Thread Jonathan Wakely
On Sat, 11 Jan 2025, 18:31 Torbjörn SVENSSON, 
wrote:

> Ok for trunk and releases/gcc-14?
>

OK, thanks



> --
>
> When running tests using the "sim" config, the command is launched in
> non-readonly mode and the text retrieved from the expect command will
> then replace all LF with CRLF. (The problem can be found in sim_load
> where it calls remote_spawn without an input file).
>
> libstdc++-v3/ChangeLog:
>
> * 27_io/print/1.cc: Allow both LF and CRLF in test.
>
> Signed-off-by: Torbjörn SVENSSON 
> ---
>  libstdc++-v3/testsuite/27_io/print/1.cc | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/libstdc++-v3/testsuite/27_io/print/1.cc
> b/libstdc++-v3/testsuite/27_io/print/1.cc
> index f6585d9880a..2a74e5002f4 100644
> --- a/libstdc++-v3/testsuite/27_io/print/1.cc
> +++ b/libstdc++-v3/testsuite/27_io/print/1.cc
> @@ -18,7 +18,7 @@ void
>  test_println_default()
>  {
>std::println("I walk the line");
> -  // { dg-output "I walk the line\n" }
> +  // { dg-output "I walk the line\r?\n" }
>  }
>
>  void
> --
> 2.25.1
>
>


[PATCH] testsuite: The expect framework might introduce CR in output

2025-01-11 Thread Torbjörn SVENSSON
Changes since v1:

- Found out that 27_io/print/3.cc has the same kind of issue.

Ok for trunk and releases/gcc-14?

--

When running tests using the "sim" config, the command is launched in
non-readonly mode and the text retrieved from the expect command will
then replace all LF with CRLF. (The problem can be found in sim_load
where it calls remote_spawn without an input file).

libstdc++-v3/ChangeLog:

* 27_io/print/1.cc: Allow both LF and CRLF in test.
* 27_io/print/3.cc: Likewise.

Signed-off-by: Torbjörn SVENSSON 
---
 libstdc++-v3/testsuite/27_io/print/1.cc | 2 +-
 libstdc++-v3/testsuite/27_io/print/3.cc | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/testsuite/27_io/print/1.cc 
b/libstdc++-v3/testsuite/27_io/print/1.cc
index f6585d9880a..2a74e5002f4 100644
--- a/libstdc++-v3/testsuite/27_io/print/1.cc
+++ b/libstdc++-v3/testsuite/27_io/print/1.cc
@@ -18,7 +18,7 @@ void
 test_println_default()
 {
   std::println("I walk the line");
-  // { dg-output "I walk the line\n" }
+  // { dg-output "I walk the line\r?\n" }
 }
 
 void
diff --git a/libstdc++-v3/testsuite/27_io/print/3.cc 
b/libstdc++-v3/testsuite/27_io/print/3.cc
index ffcf7337ce5..90c4cfb0d48 100644
--- a/libstdc++-v3/testsuite/27_io/print/3.cc
+++ b/libstdc++-v3/testsuite/27_io/print/3.cc
@@ -13,7 +13,7 @@ test_println_blank()
   std::print("1");
   std::println();
   std::println("2");
-  // { dg-output "1\n2" }
+  // { dg-output "1\r?\n2" }
 }
 
 void
-- 
2.25.1



Re: [PATCH] phiopt: Reject hot/cold predictors for early phiopt [PR117935]

2025-01-11 Thread Andrew Pinski
On Sat, Jan 11, 2025 at 2:10 AM Richard Biener
 wrote:
>
>
>
> > Am 11.01.2025 um 09:49 schrieb Andrew Pinski :
> >
> > In this case, early phiopt would get rid of the user provided predicator
> > for hot/cold as it would remove the basic blocks. The easiest and best 
> > option is
> > for early phi-opt don't do phi-opt if the middle basic-block(s) have either
> > a hot or cold predict statement. Then after inlining, jump threading will 
> > most likely
> > happen and that will keep around the predictor.
> >
> > Note this only needs to be done for match_simplify_replacement and not the 
> > other
> > phi-opt functions because currently only match_simplify_replacement is able 
> > to skip
> > middle bb with predicator statements in it.
> >
> > OK? Bootstrapped and tested on x86_64-linux-gnu.
>
> Hmm, I think we still want to allow abs/min/max replacements early I think.
>
> What kind of replacement is performed in the problematic case?

Replace `a ? true : false` with simple `a`. So should we just reject
that if early and the middle contains a hot/cold predictor?
That is the only case where jump threading will most likely happen
after inlining too.
I will work on that tomorrow.

>
> >
> >PR tree-optimization/117935
> >
> > gcc/ChangeLog:
> >
> >* tree-ssa-phiopt.cc (contains_hot_cold_predict): New function.
> >(match_simplify_replacement): Return early if early_p and one of
> >the middle bb(s) have a hot/cold predict statement.
> >
> > gcc/testsuite/ChangeLog:
> >
> >* gcc.dg/predict-24.c: New test.
> >
> > Signed-off-by: Andrew Pinski 
> > ---
> > gcc/testsuite/gcc.dg/predict-24.c | 24 +
> > gcc/tree-ssa-phiopt.cc| 35 +++
> > 2 files changed, 59 insertions(+)
> > create mode 100644 gcc/testsuite/gcc.dg/predict-24.c
> >
> > diff --git a/gcc/testsuite/gcc.dg/predict-24.c 
> > b/gcc/testsuite/gcc.dg/predict-24.c
> > new file mode 100644
> > index 000..b2c8a77c323
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/predict-24.c
> > @@ -0,0 +1,24 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -fdump-tree-profile_estimate" } */
> > +/* PR tree-optimization/117935 */
> > +
> > +static inline bool has_value(bool b)
> > +{
> > +  if (b)
> > +  {
> > +[[gnu::hot, gnu::unused]] label1:
> > +return true;
> > +  }
> > +  else
> > +return false;
> > +}
> > +/* The hot label should last until it gets inlined into value_or and 
> > jump_threaded.  */
> > +int value_or(bool b, int def0, int def1)
> > +{
> > +  if (has_value(b))
> > +return def0;
> > +  else
> > +return def1;
> > +}
> > +/* { dg-final { scan-tree-dump-times "first match heuristics: 90.00%" 2 
> > "profile_estimate"} } */
> > +/* { dg-final { scan-tree-dump-times "hot label heuristics of edge 
> > \[0-9\]+->\[0-9]+: 90.00%" 2 "profile_estimate"} } */
> > diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
> > index 64d3ba9e160..8fed3eadbb0 100644
> > --- a/gcc/tree-ssa-phiopt.cc
> > +++ b/gcc/tree-ssa-phiopt.cc
> > @@ -56,6 +56,7 @@ along with GCC; see the file COPYING3.  If not see
> > #include "tree-ssa-dce.h"
> > #include "calls.h"
> > #include "tree-ssa-loop-niter.h"
> > +#include "gimple-predict.h"
> >
> > /* Return the singleton PHI in the SEQ of PHIs for edges E0 and E1. */
> >
> > @@ -916,6 +917,27 @@ auto_flow_sensitive::~auto_flow_sensitive ()
> > p.second.restore (p.first);
> > }
> >
> > +/* Returns true if BB contains an user provided predictor
> > +   (PRED_HOT_LABEL/PRED_COLD_LABEL).  */
> > +
> > +static bool
> > +contains_hot_cold_predict (basic_block bb)
> > +{
> > +  gimple_stmt_iterator gsi;
> > +  gsi = gsi_start_nondebug_after_labels_bb (bb);
> > +  for (; !gsi_end_p (gsi); gsi_next_nondebug (&gsi))
> > +{
> > +  gimple *s = gsi_stmt (gsi);
> > +  if (gimple_code (s) != GIMPLE_PREDICT)
> > +continue;
> > +  auto predict = gimple_predict_predictor (s);
> > +  if (predict == PRED_HOT_LABEL
> > +  || predict == PRED_COLD_LABEL)
> > +return true;
> > +}
> > +  return false;
> > +}
> > +
> > /*  The function match_simplify_replacement does the main work of doing the
> > replacement using match and simplify.  Return true if the replacement 
> > is done.
> > Otherwise return false.
> > @@ -953,6 +975,19 @@ match_simplify_replacement (basic_block cond_bb, 
> > basic_block middle_bb,
> >  stmt_to_move_alt))
> > return false;
> >
> > +  /* For early phiopt, we don't want to lose user generated predictors,
> > + that is if either middle bb contains either a hot or cold label 
> > predict,
> > + the phiopt should be skipped. */
> > +  if (early_p)
> > +{
> > +  if (contains_hot_cold_predict (middle_bb))
> > +return false;
> > +  if (threeway_p
> > +  && middle_bb != middle_bb_alt
> > +  && contains_hot_cold_predict (middle_bb_alt))
> > +return false;
> > +}
> > +
> >   /* Do not make conditional undefs unconditional

Re: [PATCH] phiopt: Reject hot/cold predictors for early phiopt [PR117935]

2025-01-11 Thread Richard Biener



> Am 11.01.2025 um 11:29 schrieb Andrew Pinski :
> 
> On Sat, Jan 11, 2025 at 2:10 AM Richard Biener
>  wrote:
>> 
>> 
>> 
 Am 11.01.2025 um 09:49 schrieb Andrew Pinski :
>>> 
>>> In this case, early phiopt would get rid of the user provided predicator
>>> for hot/cold as it would remove the basic blocks. The easiest and best 
>>> option is
>>> for early phi-opt don't do phi-opt if the middle basic-block(s) have either
>>> a hot or cold predict statement. Then after inlining, jump threading will 
>>> most likely
>>> happen and that will keep around the predictor.
>>> 
>>> Note this only needs to be done for match_simplify_replacement and not the 
>>> other
>>> phi-opt functions because currently only match_simplify_replacement is able 
>>> to skip
>>> middle bb with predicator statements in it.
>>> 
>>> OK? Bootstrapped and tested on x86_64-linux-gnu.
>> 
>> Hmm, I think we still want to allow abs/min/max replacements early I think.
>> 
>> What kind of replacement is performed in the problematic case?
> 
> Replace `a ? true : false` with simple `a`. So should we just reject
> that if early and the middle contains a hot/cold predictor?
> That is the only case where jump threading will most likely happen
> after inlining too.
> I will work on that tomorrow.

Hmm, I see.  So this is effectively setting likeliness to value range parts.  I 
can see how this is indeed useful information we shoulda’t dispose of.  The 
question is whether we can heuristically decide if it’s useful from the PHI def 
uses?  I suppose your patch is OK as-is.

Richard 

> 
>> 
>>> 
>>>   PR tree-optimization/117935
>>> 
>>> gcc/ChangeLog:
>>> 
>>>   * tree-ssa-phiopt.cc (contains_hot_cold_predict): New function.
>>>   (match_simplify_replacement): Return early if early_p and one of
>>>   the middle bb(s) have a hot/cold predict statement.
>>> 
>>> gcc/testsuite/ChangeLog:
>>> 
>>>   * gcc.dg/predict-24.c: New test.
>>> 
>>> Signed-off-by: Andrew Pinski 
>>> ---
>>> gcc/testsuite/gcc.dg/predict-24.c | 24 +
>>> gcc/tree-ssa-phiopt.cc| 35 +++
>>> 2 files changed, 59 insertions(+)
>>> create mode 100644 gcc/testsuite/gcc.dg/predict-24.c
>>> 
>>> diff --git a/gcc/testsuite/gcc.dg/predict-24.c 
>>> b/gcc/testsuite/gcc.dg/predict-24.c
>>> new file mode 100644
>>> index 000..b2c8a77c323
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.dg/predict-24.c
>>> @@ -0,0 +1,24 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-O2 -fdump-tree-profile_estimate" } */
>>> +/* PR tree-optimization/117935 */
>>> +
>>> +static inline bool has_value(bool b)
>>> +{
>>> +  if (b)
>>> +  {
>>> +[[gnu::hot, gnu::unused]] label1:
>>> +return true;
>>> +  }
>>> +  else
>>> +return false;
>>> +}
>>> +/* The hot label should last until it gets inlined into value_or and 
>>> jump_threaded.  */
>>> +int value_or(bool b, int def0, int def1)
>>> +{
>>> +  if (has_value(b))
>>> +return def0;
>>> +  else
>>> +return def1;
>>> +}
>>> +/* { dg-final { scan-tree-dump-times "first match heuristics: 90.00%" 2 
>>> "profile_estimate"} } */
>>> +/* { dg-final { scan-tree-dump-times "hot label heuristics of edge 
>>> \[0-9\]+->\[0-9]+: 90.00%" 2 "profile_estimate"} } */
>>> diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
>>> index 64d3ba9e160..8fed3eadbb0 100644
>>> --- a/gcc/tree-ssa-phiopt.cc
>>> +++ b/gcc/tree-ssa-phiopt.cc
>>> @@ -56,6 +56,7 @@ along with GCC; see the file COPYING3.  If not see
>>> #include "tree-ssa-dce.h"
>>> #include "calls.h"
>>> #include "tree-ssa-loop-niter.h"
>>> +#include "gimple-predict.h"
>>> 
>>> /* Return the singleton PHI in the SEQ of PHIs for edges E0 and E1. */
>>> 
>>> @@ -916,6 +917,27 @@ auto_flow_sensitive::~auto_flow_sensitive ()
>>>p.second.restore (p.first);
>>> }
>>> 
>>> +/* Returns true if BB contains an user provided predictor
>>> +   (PRED_HOT_LABEL/PRED_COLD_LABEL).  */
>>> +
>>> +static bool
>>> +contains_hot_cold_predict (basic_block bb)
>>> +{
>>> +  gimple_stmt_iterator gsi;
>>> +  gsi = gsi_start_nondebug_after_labels_bb (bb);
>>> +  for (; !gsi_end_p (gsi); gsi_next_nondebug (&gsi))
>>> +{
>>> +  gimple *s = gsi_stmt (gsi);
>>> +  if (gimple_code (s) != GIMPLE_PREDICT)
>>> +continue;
>>> +  auto predict = gimple_predict_predictor (s);
>>> +  if (predict == PRED_HOT_LABEL
>>> +  || predict == PRED_COLD_LABEL)
>>> +return true;
>>> +}
>>> +  return false;
>>> +}
>>> +
>>> /*  The function match_simplify_replacement does the main work of doing the
>>>replacement using match and simplify.  Return true if the replacement is 
>>> done.
>>>Otherwise return false.
>>> @@ -953,6 +975,19 @@ match_simplify_replacement (basic_block cond_bb, 
>>> basic_block middle_bb,
>>> stmt_to_move_alt))
>>>return false;
>>> 
>>> +  /* For early phiopt, we don't want to lose user generated predictors,
>>> + that is if either middle bb contains either a hot or cold lab

Re: [PATCH] rtl: Remove invalid compare simplification [PR117186]

2025-01-11 Thread Andreas Schwab
This breaks m68k:

$ ./xgcc -B./  -xc -nostdinc /dev/null -S -o /dev/null 
-fself-test=../../gcc/testsuite/selftests
../../gcc/simplify-rtx.cc:8600: test_comparisons: FAIL: ASSERT_RTX_EQ (val, 
folded)
  expected: (const_int -1 [0x])
  actual: (const_int 0 [0])
cc1: internal compiler error: in assert_rtx_eq_at, at selftest-rtl.cc:57
0x1d1d7bf internal_error(char const*, ...)
../../gcc/diagnostic-global-context.cc:517
0x685a13 fancy_abort(char const*, int, char const*)
../../gcc/diagnostic.cc:1722
0xe1d284 selftest::assert_rtx_eq_at(selftest::location const&, char const*, 
rtx_def*, rtx_def*)
../../gcc/selftest-rtl.cc:57
0xe3834c test_comparisons
../../gcc/simplify-rtx.cc:8600
0xe3834c test_scalar_ops
../../gcc/simplify-rtx.cc:8631
0xe3834c selftest::simplify_rtx_cc_tests()
../../gcc/simplify-rtx.cc:9206
0x1c19486 selftest::run_tests()
../../gcc/selftest-run-tests.cc:122
0xe66e2a toplev::run_self_tests()
../../gcc/toplev.cc:2267

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


[patch,avr] Try to work around PR118012 / PR118360

2025-01-11 Thread Georg-Johann Lay

The patch below is for trunk.

Andrew Pinski says he has a patch to fix it, bit that won't materialize
before v16.

AVR: PR118012 - Try to work around sick code from match.pd.

This patch tries to work around PR118012 which may use a
full fledged multiplication instead of a simple bit test.
This is because match.pd's

/* (zero_one == 0) ? y : z  y -> ((typeof(y))zero_one * z)  y */
/* (zero_one != 0) ? z  y : y -> ((typeof(y))zero_one * z)  y */

"optimizes" code with op in { plus, ior, xor } like

  if (a & 1)
b = b  c;

to something like:

  x1 = EXTRACT_BIT0 (a);
  x2 = c MULT x1;
  b = b  x2;

or

  x1 = EXTRACT_BIT0 (a);
  x2 = ZERO_EXTEND (x1);
  x3 = NEG x2;
  x4 = a AND x3:
  b = b  x4;

which is very expensive and may even result in a libgcc call for
a 32-bit multiplication on devices that don't even have MUL.
Notice that EXTRACT_BIT0 is already more expensive (slower, more
code, more register pressure) than a bit-test + branch.

The patch:

o Adds some combiner patterns that try to map sick code back
  to a bit test + branch.

o Adjusts costs to make MULT (x AND 1) cheap, in the hope that the
  middle-end will use that alternative (which we map to sane code).

o On devices without MUL, 32-bit multiplication was performed by a
  library call, which bypasses the MULT (x AND 1) and similar patterns.
  Therefore, mulsi3 is also allowed for devices without MUL so that
  we get at MULT pattern that can be transformed.  (Though this is
  not possible on AVR_TINY since it passes arguments on the stack).

o Add a new command line option -mpr118012, so most of the patterns
  and cost computations can be switched off as they have
  avropt_pr118012 in their insn condition.

o Added sign-extract.0 patterns unconditionally (no avropt_pr118012).

Notice that this patch is just a work-around, it's not a fix of the
root cause, which are the patterns in match.pd that don't care about
the target and don't even care about costs.

The work-around is incomplete, and 3 of the new tests are still failing.
This is because there are situations where it does not work:

* The MULT is realized as a library call.

* The MULT is realized as an ASHIFT, and the ASHIFT again is transformed
  into something else.  For example, with -O2 -mmcu=atmega128,
  ASHIFT(3) is transformed into ASHIFT(1) + ASHIFT(2).


PR tree-optimization/118012
PR tree-optimization/118360
gcc/
* config/avr/avr.opt (-mpr118012): New undocumented option.
* config/avr/avr-protos.h (avr_out_sextr)
(avr_emit_skip_pixop, avr_emit_skip_clear): New protos.
* config/avr/avr.cc (avr_adjust_insn_length)
[case ADJUST_LEN_SEXTR]: Handle case.
(avr_rtx_costs_1) [NEG]: Costs for NEG (ZERO_EXTEND (ZERO_EXTRACT)).
[MULT && avropt_pr118012]: Costs for MULT (x AND 1).
(avr_out_sextr, avr_emit_skip_pixop, avr_emit_skip_clear): New
functions.
* config/avr/avr.md [avropt_pr118012]: Add combine patterns with
that condition that try to work around PR118012.
(adjust_len) : Add insn attr value.
(pixop): New code iterator.
(mulsi3) [avropt_pr118012 && !AVR_TINY]: Allow these in insn condition.
gcc/testsuite/
* gcc.target/avr/mmcu/pr118012-1.h: New file.
* gcc.target/avr/mmcu/pr118012-1-o2-m128.c: New test.
* gcc.target/avr/mmcu/pr118012-1-os-m128.c: New test.
* gcc.target/avr/mmcu/pr118012-1-o2-m103.c: New test.
* gcc.target/avr/mmcu/pr118012-1-os-m103.c: New test.
* gcc.target/avr/mmcu/pr118012-1-o2-t40.c: New test.
* gcc.target/avr/mmcu/pr118012-1-os-t40.c: New test.
* gcc.target/avr/mmcu/pr118360-1.h: New file.
* gcc.target/avr/mmcu/pr118360-1-o2-m128.c: New test.
* gcc.target/avr/mmcu/pr118360-1-os-m128.c: New test.
* gcc.target/avr/mmcu/pr118360-1-o2-m103.c: New test.
* gcc.target/avr/mmcu/pr118360-1-os-m103.c: New test.
* gcc.target/avr/mmcu/pr118360-1-o2-t40.c: New test.
* gcc.target/avr/mmcu/pr118360-1-os-t40.c: New test.diff --git a/gcc/config/avr/avr-protos.h b/gcc/config/avr/avr-protos.h
index b93f2b37a72..6fb7c41e1da 100644
--- a/gcc/config/avr/avr-protos.h
+++ b/gcc/config/avr/avr-protos.h
@@ -69,6 +69,7 @@ extern const char *avr_out_insert_notbit (rtx_insn *, rtx*, int*);
 extern const char *avr_out_insv (rtx_insn *, rtx*, int*);
 extern const char *avr_out_extr (rtx_insn *, rtx*, int*);
 extern const char *avr_out_extr_not (rtx_insn *, rtx*, int*);
+extern const char *avr_out_sextr (rtx_insn *, rtx*, int*);
 extern const char *avr_out_plus_set_ZN (rtx*, int*);
 extern const char *avr_out_plus_set_N (rtx*, int*);
 extern const char *avr_out_op8_set_ZN (rtx_code, rtx*, int*);
@@ -102,6 +103,8 @@ extern void avr_expand_prologue (void);
 extern void avr_expand_epilogue (bool);
 extern bool avr_emit_cpymemhi (rtx*);
 extern void avr_emit_xior_with_shift (rtx_insn*, rtx*, int);
+extern void avr_emit_skip_pixop (rtx_code, rtx, rtx, rt

Re: [PATCH] final: Fix get_attr_length for asm goto [PR118411]

2025-01-11 Thread Jeff Law




On 1/11/25 2:52 AM, Andrew Pinski wrote:

The problem is for inline-asm goto, the outer rtl insn type
is a jump_insn and get_attr_length does not handle ASM specially
unlike if the outer rtl insn type was just insn.

This fixes the issue by adding support for both CALL_INSN and JUMP_INSN
with asm.

OK? Bootstrapped and tested on x86_64-linux-gnu.

PR middle-end/118411

gcc/ChangeLog:

* final.cc (get_attr_length_1): Handle asm for CALL_INSN
and JUMP_INSNs.

OK
jeff



[r14-11203 Regression] FAIL: g++.dg/modules/pr114630_a.C -std=c++2b (test for excess errors) on Linux/x86_64

2025-01-11 Thread haochen.jiang
On Linux/x86_64,

7c013a681cc138b8f191b75e408fe322d7fd998c is the first bad commit
commit 7c013a681cc138b8f191b75e408fe322d7fd998c
Author: Nathaniel Shead 
Date:   Fri Jan 10 01:06:37 2025 +1100

c++/modules: Handle chaining already-imported local types [PR114630]

caused

FAIL: g++.dg/modules/pr114630_a.C module-cmi X (gcm.cache/X.gcm)
FAIL: g++.dg/modules/pr114630_a.C -std=c++17 (test for excess errors)
FAIL: g++.dg/modules/pr114630_a.C -std=c++2a (test for excess errors)
FAIL: g++.dg/modules/pr114630_a.C -std=c++2b (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/export/users/haochenj/src/gcc-bisect/gcc-14/releases/gcc-14/r14-11203/usr
 --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="modules.exp=g++.dg/modules/pr114630_a.C 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="modules.exp=g++.dg/modules/pr114630_a.C 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="modules.exp=g++.dg/modules/pr114630_a.C 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="modules.exp=g++.dg/modules/pr114630_a.C 
--target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email, for question about this report, contact me 
at haochen dot jiang at intel.com.)
(If you met problems with cascadelake related, disabling AVX512F in command 
line might save that.)
(However, please make sure that there is no potential problems with AVX512.)


Re: [PATCH] final: Fix get_attr_length for asm goto [PR118411]

2025-01-11 Thread Jeff Law




On 1/11/25 2:20 PM, Jakub Jelinek wrote:

On Sat, Jan 11, 2025 at 01:52:59AM -0800, Andrew Pinski wrote:

The problem is for inline-asm goto, the outer rtl insn type
is a jump_insn and get_attr_length does not handle ASM specially
unlike if the outer rtl insn type was just insn.

This fixes the issue by adding support for both CALL_INSN and JUMP_INSN
with asm.

OK? Bootstrapped and tested on x86_64-linux-gnu.

PR middle-end/118411

gcc/ChangeLog:

* final.cc (get_attr_length_1): Handle asm for CALL_INSN
and JUMP_INSNs.

Signed-off-by: Andrew Pinski 
---
  gcc/final.cc | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/final.cc b/gcc/final.cc
index 19c5d390c78..12c6eb0ac09 100644
--- a/gcc/final.cc
+++ b/gcc/final.cc
@@ -363,7 +363,11 @@ get_attr_length_1 (rtx_insn *insn, int (*fallback_fn) 
(rtx_insn *))
  
case CALL_INSN:

case JUMP_INSN:
-   length = fallback_fn (insn);
+   body = PATTERN (insn);
+   if (GET_CODE (body) == ASM_INPUT || asm_noperands (body) >= 0)
+ length = asm_insn_count (body) * fallback_fn (insn);
+   else
+ length = fallback_fn (insn);
break;


A CALL_INSN can't be ever an inline asm, and ASM_INPUT (i.e. asm
("something");) can't be even JUMP_INSN.
So, shouldn't this be instead
   case JUMP_INSN:
if (asm_noperands (body) >= 0)
  {
length = asm_insn_count (body) * fallback_fn (insn);
break;
  }
/* FALLTHRU */
   case CALL_INSN:
length = fallback_fn (insn);
break;
?
Yea, but we never thought we'd have JUMP_INSNs as ASMs either  I 
don't think Andrew's approach will cause any problems and does future 
proof this tiny bit of code.


jeff



[PATCH] final: Fix get_attr_length for asm goto [PR118411]

2025-01-11 Thread Andrew Pinski
The problem is for inline-asm goto, the outer rtl insn type
is a jump_insn and get_attr_length does not handle ASM specially
unlike if the outer rtl insn type was just insn.

This fixes the issue by adding support for both CALL_INSN and JUMP_INSN
with asm.

OK? Bootstrapped and tested on x86_64-linux-gnu.

PR middle-end/118411

gcc/ChangeLog:

* final.cc (get_attr_length_1): Handle asm for CALL_INSN
and JUMP_INSNs.

Signed-off-by: Andrew Pinski 
---
 gcc/final.cc | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/final.cc b/gcc/final.cc
index 19c5d390c78..12c6eb0ac09 100644
--- a/gcc/final.cc
+++ b/gcc/final.cc
@@ -363,7 +363,11 @@ get_attr_length_1 (rtx_insn *insn, int (*fallback_fn) 
(rtx_insn *))
 
   case CALL_INSN:
   case JUMP_INSN:
-   length = fallback_fn (insn);
+   body = PATTERN (insn);
+   if (GET_CODE (body) == ASM_INPUT || asm_noperands (body) >= 0)
+ length = asm_insn_count (body) * fallback_fn (insn);
+   else
+ length = fallback_fn (insn);
break;
 
   case INSN:
-- 
2.43.0



Re: [PATCH] phiopt: Reject hot/cold predictors for early phiopt [PR117935]

2025-01-11 Thread Richard Biener



> Am 11.01.2025 um 09:49 schrieb Andrew Pinski :
> 
> In this case, early phiopt would get rid of the user provided predicator
> for hot/cold as it would remove the basic blocks. The easiest and best option 
> is
> for early phi-opt don't do phi-opt if the middle basic-block(s) have either
> a hot or cold predict statement. Then after inlining, jump threading will 
> most likely
> happen and that will keep around the predictor.
> 
> Note this only needs to be done for match_simplify_replacement and not the 
> other
> phi-opt functions because currently only match_simplify_replacement is able 
> to skip
> middle bb with predicator statements in it.
> 
> OK? Bootstrapped and tested on x86_64-linux-gnu.

Hmm, I think we still want to allow abs/min/max replacements early I think.

What kind of replacement is performed in the problematic case?

> 
>PR tree-optimization/117935
> 
> gcc/ChangeLog:
> 
>* tree-ssa-phiopt.cc (contains_hot_cold_predict): New function.
>(match_simplify_replacement): Return early if early_p and one of
>the middle bb(s) have a hot/cold predict statement.
> 
> gcc/testsuite/ChangeLog:
> 
>* gcc.dg/predict-24.c: New test.
> 
> Signed-off-by: Andrew Pinski 
> ---
> gcc/testsuite/gcc.dg/predict-24.c | 24 +
> gcc/tree-ssa-phiopt.cc| 35 +++
> 2 files changed, 59 insertions(+)
> create mode 100644 gcc/testsuite/gcc.dg/predict-24.c
> 
> diff --git a/gcc/testsuite/gcc.dg/predict-24.c 
> b/gcc/testsuite/gcc.dg/predict-24.c
> new file mode 100644
> index 000..b2c8a77c323
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/predict-24.c
> @@ -0,0 +1,24 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-profile_estimate" } */
> +/* PR tree-optimization/117935 */
> +
> +static inline bool has_value(bool b)
> +{
> +  if (b)
> +  {
> +[[gnu::hot, gnu::unused]] label1:
> +return true;
> +  }
> +  else
> +return false;
> +}
> +/* The hot label should last until it gets inlined into value_or and 
> jump_threaded.  */
> +int value_or(bool b, int def0, int def1)
> +{
> +  if (has_value(b))
> +return def0;
> +  else
> +return def1;
> +}
> +/* { dg-final { scan-tree-dump-times "first match heuristics: 90.00%" 2 
> "profile_estimate"} } */
> +/* { dg-final { scan-tree-dump-times "hot label heuristics of edge 
> \[0-9\]+->\[0-9]+: 90.00%" 2 "profile_estimate"} } */
> diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
> index 64d3ba9e160..8fed3eadbb0 100644
> --- a/gcc/tree-ssa-phiopt.cc
> +++ b/gcc/tree-ssa-phiopt.cc
> @@ -56,6 +56,7 @@ along with GCC; see the file COPYING3.  If not see
> #include "tree-ssa-dce.h"
> #include "calls.h"
> #include "tree-ssa-loop-niter.h"
> +#include "gimple-predict.h"
> 
> /* Return the singleton PHI in the SEQ of PHIs for edges E0 and E1. */
> 
> @@ -916,6 +917,27 @@ auto_flow_sensitive::~auto_flow_sensitive ()
> p.second.restore (p.first);
> }
> 
> +/* Returns true if BB contains an user provided predictor
> +   (PRED_HOT_LABEL/PRED_COLD_LABEL).  */
> +
> +static bool
> +contains_hot_cold_predict (basic_block bb)
> +{
> +  gimple_stmt_iterator gsi;
> +  gsi = gsi_start_nondebug_after_labels_bb (bb);
> +  for (; !gsi_end_p (gsi); gsi_next_nondebug (&gsi))
> +{
> +  gimple *s = gsi_stmt (gsi);
> +  if (gimple_code (s) != GIMPLE_PREDICT)
> +continue;
> +  auto predict = gimple_predict_predictor (s);
> +  if (predict == PRED_HOT_LABEL
> +  || predict == PRED_COLD_LABEL)
> +return true;
> +}
> +  return false;
> +}
> +
> /*  The function match_simplify_replacement does the main work of doing the
> replacement using match and simplify.  Return true if the replacement is 
> done.
> Otherwise return false.
> @@ -953,6 +975,19 @@ match_simplify_replacement (basic_block cond_bb, 
> basic_block middle_bb,
>  stmt_to_move_alt))
> return false;
> 
> +  /* For early phiopt, we don't want to lose user generated predictors,
> + that is if either middle bb contains either a hot or cold label predict,
> + the phiopt should be skipped. */
> +  if (early_p)
> +{
> +  if (contains_hot_cold_predict (middle_bb))
> +return false;
> +  if (threeway_p
> +  && middle_bb != middle_bb_alt
> +  && contains_hot_cold_predict (middle_bb_alt))
> +return false;
> +}
> +
>   /* Do not make conditional undefs unconditional.  */
>   if ((TREE_CODE (arg0) == SSA_NAME
>&& ssa_name_maybe_undef_p (arg0))
> --
> 2.43.0
> 


[PATCH] aarch64: Provide initial specifications for Apple CPU cores.

2025-01-11 Thread Iain Sandoe
Hi,

I originally made this patch for the Darwin Arm64 development branch,
however in discussions on IRC, it seems that it is also relevant to
Linux - since there are implementations running on Apple hardware with
the M1..3 CPUs.  It might also be helpful to the resolution of
PR113257 - although it is not a solution on its own.

Bootstrapped and tested manually (that it gives the expected .arch lines)
on aarch64-linux.

OK for trunk?
thanks
Iain

--- 8< ---

This covers the M1-M3 cores used in Apple desktop hardware that is also
sometimes used with Linux as the OS.

It does not cover the wider range that might be used in iOS and other
embedded platform versions.

Some of the content is estimates/best guesses - based on the following
public sources of information:
 * XNU (only for the Apple Implementer ID)
 * sysctl -a | grep hw on various M1, M2 and machines
 * AArch64.td from the Apple Open Source repo for LLVM.
 * What XCode-14 clang passes to cc1.

Unfortunately, these sources are in conflict; in particular the clang-claimed
feature set disagrees with the output of sysctl -a, and the base Arm revs.
claimed in some cases miss features that ARM DDI 0487J.a lists as mandatory
for the rev.

This latter point might not be actually significant - but for the sake of
caution I've made the spec use the lower arch rev + the additional features
that are consistently claimed by both sysctl and clang.

GCC does not seem to have a scheduler that is similar to the "Cyclone" one
in LLVM - so I've guessed to use cortex57 (but, maybe we miss 8-issue, it's
not clear - and my experience with the scheduler is ≈ 0).

Likewise we do not (yet) have specific cost models, so choose the generic
Armv8 one.

Thus, the choices here are intended to be conservative.

 * Currently, we do not seem to have any way to specify that M2/M3 has support
  for FEAT_BTI, but because of missing feaures is not compliant with the Arm
  base rev that implies this.
 * Proper version numbers are not readily available.
 * Since we have FIRESTORM/ICESTORM and similar pairs for the performance and
   efficiency cores on various machines, perhaps we should be using a big.LITTLE
   configuration; OTOH currently, I have no idea if that is usable in any way
   with the hardware as configured.

gcc/ChangeLog:

* config/aarch64/aarch64-cores.def (AARCH64_CORE): Add apple-a12,
apple-m1, apple-m2, apple-m3.
* config/aarch64/aarch64-tune.md: Regenerate.

Signed-off-by: Iain Sandoe 
---
 gcc/config/aarch64/aarch64-cores.def | 12 
 gcc/config/aarch64/aarch64-tune.md   |  2 +-
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/gcc/config/aarch64/aarch64-cores.def 
b/gcc/config/aarch64/aarch64-cores.def
index caf61437d18..0bd3e80cf7f 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -173,6 +173,18 @@ AARCH64_CORE("cortex-a76.cortex-a55",  cortexa76cortexa55, 
cortexa53, V8_2A,  (F
 AARCH64_CORE("cortex-r82", cortexr82, cortexa53, V8R, (), cortexa53, 0x41, 
0xd15, -1)
 AARCH64_CORE("cortex-r82ae", cortexr82ae, cortexa53, V8R, (), cortexa53, 0x41, 
0xd14, -1)
 
+/* Apple (A12 and M) cores based on Armv8.
+   Apple implementer ID from xnu,
+   Guesses for part # and suitable scheduler ident, generic_armv8_a for costs.
+   A12 seems mostly 8.3,
+   M1 seems to be 8.4 + extras (see comments in option-extensions about 
f16fml),
+   M2 mostly 8.5 but with missing mandatory features.
+   M3 is essentially the same as M2 for the features declared here.  */
+AARCH64_CORE("apple-a12", applea12, cortexa53, V8_3A,  (), generic_armv8_a, 
0x61, 0x12, -1)
+AARCH64_CORE("apple-m1", applem1, cortexa57, V8_4A,  (F16, SB, SSBS), 
generic_armv8_a, 0x61, 0x23, -1)
+AARCH64_CORE("apple-m2", applem2, cortexa57, V8_4A,  (I8MM, BF16, F16, SB, 
SSBS), generic_armv8_a, 0x61, 0x23, -1)
+AARCH64_CORE("apple-m3", applem3, cortexa57, V8_4A,  (I8MM, BF16, F16, SB, 
SSBS), generic_armv8_a, 0x61, 0x23, -1)
+
 /* Armv9.0-A Architecture Processors.  */
 
 /* Arm ('A') cores. */

-- 
2.39.2 (Apple Git-143)



Re: [PATCH v4] c/c++: UX improvements to 'too {few,many} arguments' errors [PR118112]

2025-01-11 Thread Jason Merrill

On 1/10/25 1:47 PM, David Malcolm wrote:

On Thu, 2025-01-09 at 22:28 -0500, David Malcolm wrote:

On Thu, 2025-01-09 at 21:15 -0500, Jason Merrill wrote:

On 1/9/25 7:00 PM, David Malcolm wrote:

On Thu, 2025-01-09 at 14:21 -0500, Jason Merrill wrote:

Thanks for taking a look...


On 1/9/25 2:11 PM, David Malcolm wrote:

@@ -4743,7 +4769,38 @@ convert_arguments (tree typelist,
vec **values, tree fndecl,
  if (typetail && typetail != void_list_node)
    {
      if (complain & tf_error)
-       error_args_num (input_location, fndecl,
/*too_many_p=*/false);
+       {
+     /* Not enough args.
+Determine minimum number of arguments
required.  */
+     int min_expected_num = 0;
+     bool at_least_p = false;
+     tree iter = typelist;
+     while (true)
+   {
+     if (!iter)
+       {
+     /* Variadic arguments; stop
iterating.
*/
+     at_least_p = true;
+     break;
+       }
+     if (iter == void_list_node)
+       /* End of arguments; stop iterating.  */
+       break;
+     if (fndecl && TREE_PURPOSE (iter)
+     && TREE_CODE (TREE_PURPOSE (iter)) !=
DEFERRED_PARSE)



Why are you checking DEFERRED_PARSE?  That indicates a default
argument,
even if it isn't parsed yet.  For that case we should get the
error
in
convert_default_arg rather than pretend there's no default
argument.


I confess that the check for DEFERRED_PARSE was a rather mindless
copy
and paste by me from the "See if there are default arguments that
can be
used" logic earlier in the function.

I've removed it in the latest version of the patch.
   

+       {
+     /* Found a default argument; skip this
one when
+counting minimum required.  */
+     at_least_p = true;
+     iter = TREE_CHAIN (iter);
+     continue;


We could break here, once you have a default arg the rest of
the
parms
need to have them as well.


Indeed; I've updated this in the latest version of the patch, so
we break out as soon as we see an arg with a non-null
TREE_PURPOSE.




+       }
+     ++min_expected_num;
+     iter = TREE_CHAIN (iter);
+   }
+     error_args_num (input_location, fndecl,
+     min_expected_num, actual_num,
at_least_p);
+       }
      return -1;
    }


Here's a v3 version of the patch, which is currently going
through
my tester.

OK for trunk if it passes bootstrap®rtesting?


OK.


Thanks.  However, it turns out that I may have misspoke, and the v2
patch might have been the correct approach: if we bail out on the
first
arg with a TREE_PURPOSE then in
gcc/testsuite/g++.dg/cpp0x/variadic169.C

    1   │ // DR 2233
    2   │ // { dg-do compile { target c++11 } }
    3   │
    4   │ template void f(int n = 0, T ...t);
    5   │
    6   │ int main()
    7   │ {
    8   │   f(); // { dg-error "too few arguments to
function '\[^\n\r\]*'; expected at least 1, have 0" }
    9   │ }

we instead emit the nonsensical "expected at least 0, have 0":


Ah, right.


error: too few arguments to function ‘void f(int, T ...) [with T =
{int}]’; expected at least 0, have 0
     8 |   f(); // { dg-error "too few
arguments to function '\[^\n\r\]*'; expected at least 1, have 0" }
   |   ~~^~
../../src/gcc/testsuite/g++.dg/cpp0x/variadic169.C:4:30: note:
declared
here
     4 | template void f(int n = 0, T ...t);
   |  ^

whereas with the v2 patch we count the trailing arg after the default
arg, and we emit "expected at least 1, have 0", which seems correct
to me.


In that case I think the actual minimum is 2, we can't use the default 
argument.


The corresponding code in add_function_candidate uses 
remaining_arguments; it would be good to harmonize them.



I'm testing a version of the patch that continues to iterate after a
TREE_PURPOSE (as v2, but but dropping the check on DEFERRED_PARSE).

Thanks
Dave


Hi Jason

Here's an updated version of the patch which drops the check on
DEFERRED_PARSE, but continues to iterate on TREE_PURPOSE, for dealing
with the case of explicit template args where default args are followed
by missing mandatory args (the code above mentions DR777, so I
referenced that).

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.

Are the C++ parts OK for trunk?

Thanks
Dave


Consider this case of a bad call to a callback function (perhaps
due to C23 changing the meaning of () in function decls):

struct p {
 int (*bar)();
};

void baz() {
 struct p q;
 q.bar(1);
}

Before this patch the C frontend emits:

t.c: In function 'baz':
t.c:7:5: error: too many arguments to function 'q.bar'
 7 | q.bar(1);
   | ^

Re: [PATCH, v2] Fortran: implement F2018 intrinsic OUT_OF_RANGE [PR115788]

2025-01-11 Thread Thomas Koenig

Hi Harald,


Thanks for pointing this out!  I've also added a few gcc_unreachable()
to prevent other potential false positives, see attached.


Just a couple of documentation nits:  The documentation says INTEGER
or REAL only, but it also works (as an extension) for UNSIGNED. Also,
OUT_OF_RANGE is listed as not being implemented in gfortran in the
subsection about unsigned numbers.

OK if you mention these (or commit as-is and the documentation fixes
can be done later).

Best regards, and thanks a lot for the patch!

Best regards

Thomas



[PATCH] RISC-V: fix thinko in riscv_register_move_cost ()

2025-01-11 Thread Vineet Gupta
This seeming benign mistake caused a massive SPEC2017 Cactu regression
(2.1 trillion insn to 2.5 trillion) wiping out all the gains from my
recent sched1 improvement. Thankfully the issue was trivial to fix even
if hard to isolate.

On BPI3:

Before bug
--
|  Performance counter stats for './cactusBSSN_r_base-1':
|
|   4,557,471.02 msec task-clock:u #1.000 CPUs 
utilized
|  1,245  context-switches:u   #0.273 /sec
|  1  cpu-migrations:u #0.000 /sec
|205,376  page-faults:u#   45.064 /sec
|  7,291,944,801,307  cycles:u #1.600 GHz
|  2,134,835,735,951  instructions:u   #0.29  insn per 
cycle
| 10,799,296,738  branches:u   #2.370 M/sec
| 15,308,966  branch-misses:u  #0.14% of all 
branches
|
| 4557.710508078 seconds time elapsed

Bug
---
|  Performance counter stats for './cactusBSSN_r_base-2':
|
|   4,801,813.79 msec task-clock:u #1.000 CPUs 
utilized
|  8,066  context-switches:u   #1.680 /sec
|  1  cpu-migrations:u #0.000 /sec
|203,836  page-faults:u#   42.450 /sec
|  7,682,826,638,790  cycles:u #1.600 GHz
|  2,503,133,291,344  instructions:u   #0.33  insn per 
cycle
   ^
| 10,799,287,796  branches:u   #2.249 M/sec
| 16,641,200  branch-misses:u  #0.15% of all 
branches
|
| 4802.616638386 seconds time elapsed
|

Fix
---
|  Performance counter stats for './cactusBSSN_r_base-3':
|
|   4,556,170.75 msec task-clock:u #1.000 CPUs 
utilized
|  1,739  context-switches:u   #0.382 /sec
|  0  cpu-migrations:u #0.000 /sec
|203,458  page-faults:u#   44.655 /sec
|  7,289,854,613,923  cycles:u #1.600 GHz
|  2,134,854,070,916  instructions:u   #0.29  insn per 
cycle
| 10,799,296,807  branches:u   #2.370 M/sec
| 15,403,357  branch-misses:u  #0.14% of all 
branches
|
| 4556.445490123 seconds time elapsed

Fixes: 46888571d242 "RISC-V: Add cr and cf constraint"
Signed-off-by: Vineet Gupta 

gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_register_move_cost): Remove buggy
check.

Signed-off-by: Vineet Gupta 
---
 gcc/config/riscv/riscv.cc | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 65e09842fde8..d6356350dd54 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -9591,8 +9591,7 @@ riscv_register_move_cost (machine_mode mode,
   bool from_is_gpr = from == GR_REGS || from == RVC_GR_REGS;
   bool to_is_fpr = to == FP_REGS || to == RVC_FP_REGS;
   bool to_is_gpr = to == GR_REGS || to == RVC_GR_REGS;
-  if ((from_is_fpr && to == to_is_gpr) ||
-  (from_is_gpr && to_is_fpr))
+  if ((from_is_fpr && to_is_gpr) || (from_is_gpr && to_is_fpr))
 return tune_param->fmv_cost;
 
   if (from == V_REGS)
-- 
2.43.0



Re: [PATCH] testsuite: The expect framework might introduce CR in output

2025-01-11 Thread Jonathan Wakely
On Sat, 11 Jan 2025, 19:14 Torbjorn SVENSSON, 
wrote:

>
>
> On 2025-01-11 20:05, Jonathan Wakely wrote:
> >
> >
> > On Sat, 11 Jan 2025, 18:31 Torbjörn SVENSSON,
> > mailto:torbjorn.svens...@foss.st.com>>
> > wrote:
> >
> > Ok for trunk and releases/gcc-14?
> >
> >
> > OK, thanks
>
> Oh, mid-air-collision.
> Thanks for the fast review Jonathan!
> I suppose my v2 should also be ok as it (in addition to
> 27_io/print/1.cc) fixes the same kind of issue in 27_io/print/3.cc.
>


Yes, v2 is ok too


> Kind regards,
> Torbjörn
>
> >
> >
> >
> > --
> >
> > When running tests using the "sim" config, the command is launched in
> > non-readonly mode and the text retrieved from the expect command will
> > then replace all LF with CRLF. (The problem can be found in sim_load
> > where it calls remote_spawn without an input file).
> >
> > libstdc++-v3/ChangeLog:
> >
> >  * 27_io/print/1.cc: Allow both LF and CRLF in test.
> >
> > Signed-off-by: Torbjörn SVENSSON  > >
> > ---
> >   libstdc++-v3/testsuite/27_io/print/1.cc | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/libstdc++-v3/testsuite/27_io/print/1.cc b/libstdc++-v3/
> > testsuite/27_io/print/1.cc
> > index f6585d9880a..2a74e5002f4 100644
> > --- a/libstdc++-v3/testsuite/27_io/print/1.cc
> > +++ b/libstdc++-v3/testsuite/27_io/print/1.cc
> > @@ -18,7 +18,7 @@ void
> >   test_println_default()
> >   {
> > std::println("I walk the line");
> > -  // { dg-output "I walk the line\n" }
> > +  // { dg-output "I walk the line\r?\n" }
> >   }
> >
> >   void
> > --
> > 2.25.1
> >
>
>


Re: [PATCH] testsuite: libstdc++: Use effective-target libatomic

2025-01-11 Thread Jonathan Wakely
On Mon, 23 Dec 2024, 19:05 Torbjörn SVENSSON, 
wrote:

> Ok for trunk and releases/gcc-14?
>

OK



> --
>
> Test assumes libatomic.a is always available, but for some embedded
> targets, there is no libatomic.a and the test thus fail.
>
> libstdc++/ChangeLog:
>
> * 29_atomics/atomic_float/compare_exchange_padding.cc: Use
> effective-target libatomic_available.
>
> Signed-off-by: Torbjörn SVENSSON 
> ---
>  .../29_atomics/atomic_float/compare_exchange_padding.cc  | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git
> a/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> index 49626ac6651..9395e3026a7 100644
> ---
> a/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> +++
> b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
> @@ -1,5 +1,6 @@
>  // { dg-do run { target c++20 } }
>  // { dg-options "-O0" }
> +// { dg-require-effective-target libatomic_available }
>  // { dg-additional-options "[atomic_link_flags [get_multilibs]] -latomic"
> }
>
>  #include 
> --
> 2.25.1
>
>


Re: [RFA] [PR rtl-optimization/107455] Eliminate unnecessary constant load

2025-01-11 Thread Jeff Law




On 1/3/25 11:46 AM, Richard Sandiford wrote:

Jeff Law  writes:

This resurrects a patch from a bit over 2 years ago that I never wrapped
up.  IIRC, I ended up up catching covid, then in the hospital for an
unrelated issue and it just got dropped on the floor in the insanity.

The basic idea here is to help postreload-cse eliminate more
const/copies by recording a small set of conditional equivalences (as
Richi said in 2022, "Ick").

It was originally to help eliminate an unnecessary constant load I saw
in coremark, but as seen in BZ107455 the same issues show up in real
code as well.

Richard B and Richard S both had good comments last time around and
their requests are reflected in this update:

- Use rtx_equal_p rather than pointer equality
- Restrict to register "destinations"
- Restrict to integer modes
- Adjust entry block handling

My own wider scale testing resulted in a few more changes.

- Robustify extracting the (set (pc) ... ), which then required ...
- Handle if src/dst are clobbered by the conditional branch
- Fix logic error causing too many equivalences to be recorded


Heh.  I delayed reviewing this for a bit because I have no memory
of seeing it before (sign I'm getting old).  I haven't found my
previous comments, so sorry if this completely contradicts what
I said before...
Happens to me all the time.  I've even had cases where I found my 
comments and have no recollection of them at all.  It's depressing.





@@ -221,13 +222,116 @@ reload_cse_regs_1 (void)
init_alias_analysis ();
  
FOR_EACH_BB_FN (bb, cfun)

-FOR_BB_INSNS (bb, insn)
-  {
-   if (INSN_P (insn))
- cfg_changed |= reload_cse_simplify (insn, testreg);
+{
+  /* If BB has a small number of predecessors, see if each of the
+has the same implicit set.  If so, record that implicit set so
+that we can add it to the cselib tables.  */
+  rtx_insn *implicit_set;
  
-	cselib_process_insn (insn);

-  }
+  implicit_set = NULL;
+  if (EDGE_COUNT (bb->preds) <= 3)
+   {
+ edge e;
+ edge_iterator ei;
+ rtx src = NULL_RTX;
+ rtx dest = NULL_RTX;
+
+ /* Iterate over each incoming edge and see if they
+all have the same implicit set.  */
+ FOR_EACH_EDGE (e, ei, bb->preds)
+   {
+ /* If the predecessor does not end in a conditional
+jump, then it does not have an implicit set.  */
+ if (e->src == ENTRY_BLOCK_PTR_FOR_FN (cfun)
+ || !block_ends_with_condjump_p (e->src))


AFAICT, this is directly equivalent to:

   any_condjump_p (BB_END (e->src))

as checked below, so I think we should just check that (trusting
that BB_END is nonnull if e->src isn't the entry block).  In particular...

Agreed.




+   break;
+
+ /* We know the predecessor ends with a conditional
+jump.  Now dig into the actal form of the jump
+to potentially extract an implicit set.  */
+ rtx_insn *condjump = BB_END (e->src);
+ if (!condjump
+ || ! any_condjump_p (condjump)
+ || ! onlyjump_p (condjump))
+   break;


...the onlyjump_p doesn't seem necessary given the later reg_set_p.
Instead we can use...

Also agreed.




+
+ /* This predecessor ends with a possible equivalence
+producing conditional branch.  Extract the condition
+and try to use it to create an equivalence.  */
+ rtx pat = single_set (condjump);


...pc_set instead of single_set here.

Agreed that's clearer on intent.




+ rtx i_t_e = SET_SRC (pat);
+ gcc_assert (GET_CODE (i_t_e) == IF_THEN_ELSE);
+ rtx cond = XEXP (i_t_e, 0);
+ if ((GET_CODE (cond) == EQ
+  && GET_CODE (XEXP (i_t_e, 1)) == LABEL_REF
+  && XEXP (XEXP (i_t_e, 1), 0) == BB_HEAD (bb))
+ || (GET_CODE (cond) == NE
+ && XEXP (i_t_e, 2) == pc_rtx
+ && e->src->next_bb == bb))


How about using:

  if (((e->flags & EDGE_FALLTHRU) != 0)
  == (XEXP (i_t_e, 1) == pc_rtx)
   ? GET_CODE (cond) == EQ
   : GET_CODE (cond) == NE)

(borrowing a construct from cfgcleanup.cc).  That seems more general,
since some targets allow (if_then_else ... (pc) (label_ref)).
It's also a bit shorter.
Wow.   I had to read that a few times, but yea, it does appear to be 
correct to me.  I added a level of parenthesis which helps clarify 
things ever-to-slightly IMHO.


BTW, at some point in the past I played around with supporting the 
reversed pc/label_ref on a port (probably the H8).  I was surprised at 
how rarely the reversed case came up in practice.






+   {
+ /* If this is the first time through record
+the source and desti

Re: [PATCH] testsuite: libstdc++: Use effective-target libatomic

2025-01-11 Thread Torbjorn SVENSSON




On 2025-01-12 01:05, Jonathan Wakely wrote:



On Mon, 23 Dec 2024, 19:05 Torbjörn SVENSSON, 
mailto:torbjorn.svens...@foss.st.com>> 
wrote:


Ok for trunk and releases/gcc-14?


OK


Pushed as r15-6828-g4b0ef49d02f and r14.2.0-680-gd82fc939f91.

Kind regards,
Torbjörn





--

Test assumes libatomic.a is always available, but for some embedded
targets, there is no libatomic.a and the test thus fail.

libstdc++/ChangeLog:

         * 29_atomics/atomic_float/compare_exchange_padding.cc: Use
         effective-target libatomic_available.

Signed-off-by: Torbjörn SVENSSON mailto:torbjorn.svens...@foss.st.com>>
---
  .../29_atomics/atomic_float/compare_exchange_padding.cc          | 1 +
  1 file changed, 1 insertion(+)

diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_float/
compare_exchange_padding.cc b/libstdc++-v3/testsuite/29_atomics/
atomic_float/compare_exchange_padding.cc
index 49626ac6651..9395e3026a7 100644
--- a/libstdc++-v3/testsuite/29_atomics/atomic_float/
compare_exchange_padding.cc
+++ b/libstdc++-v3/testsuite/29_atomics/atomic_float/
compare_exchange_padding.cc
@@ -1,5 +1,6 @@
  // { dg-do run { target c++20 } }
  // { dg-options "-O0" }
+// { dg-require-effective-target libatomic_available }
  // { dg-additional-options "[atomic_link_flags [get_multilibs]] -
latomic" }

  #include 
-- 
2.25.1






Re: [PATCH] testsuite: The expect framework might introduce CR in output

2025-01-11 Thread Torbjorn SVENSSON




On 2025-01-12 01:04, Jonathan Wakely wrote:



On Sat, 11 Jan 2025, 19:14 Torbjorn SVENSSON, 
mailto:torbjorn.svens...@foss.st.com>> 
wrote:




On 2025-01-11 20:05, Jonathan Wakely wrote:
 >
 >
 > On Sat, 11 Jan 2025, 18:31 Torbjörn SVENSSON,
 > mailto:torbjorn.svens...@foss.st.com>
>>
 > wrote:
 >
 >     Ok for trunk and releases/gcc-14?
 >
 >
 > OK, thanks

Oh, mid-air-collision.
Thanks for the fast review Jonathan!
I suppose my v2 should also be ok as it (in addition to
27_io/print/1.cc) fixes the same kind of issue in 27_io/print/3.cc.



Yes, v2 is ok too


Pushed as r15-6829-g4b29be7216d and r14.2.0-681-gf43a6bd0879.

Kind regards,
Torbjörn




Kind regards,
Torbjörn

 >
 >
 >
 >     --
 >
 >     When running tests using the "sim" config, the command is
launched in
 >     non-readonly mode and the text retrieved from the expect
command will
 >     then replace all LF with CRLF. (The problem can be found in
sim_load
 >     where it calls remote_spawn without an input file).
 >
 >     libstdc++-v3/ChangeLog:
 >
 >              * 27_io/print/1.cc: Allow both LF and CRLF in test.
 >
 >     Signed-off-by: Torbjörn SVENSSON
mailto:torbjorn.svens...@foss.st.com>
 >     >>
 >     ---
 >       libstdc++-v3/testsuite/27_io/print/1.cc | 2 +-
 >       1 file changed, 1 insertion(+), 1 deletion(-)
 >
 >     diff --git a/libstdc++-v3/testsuite/27_io/print/1.cc b/
libstdc++-v3/
 >     testsuite/27_io/print/1.cc
 >     index f6585d9880a..2a74e5002f4 100644
 >     --- a/libstdc++-v3/testsuite/27_io/print/1.cc
 >     +++ b/libstdc++-v3/testsuite/27_io/print/1.cc
 >     @@ -18,7 +18,7 @@ void
 >       test_println_default()
 >       {
 >         std::println("I walk the line");
 >     -  // { dg-output "I walk the line\n" }
 >     +  // { dg-output "I walk the line\r?\n" }
 >       }
 >
 >       void
 >     --
 >     2.25.1
 >





Re: [PATCH] rtl-optimization/117467 - limit ext-dce memory use

2025-01-11 Thread Richard Biener



> Am 10.01.2025 um 22:17 schrieb Jeff Law :
> 
> 
> 
>> On 1/10/25 4:41 AM, Richard Biener wrote:
>> The following puts in a hard limit on ext-dce because it might end
>> up requiring memory on the order of the number of basic blocks
>> times the number of pseudo registers.  The limiting follows what
>> GCSE based passes do and thus I re-use --param max-gcse-memory here.
>> This doesn't in any way address the implementation issues of the pass,
>> but it reduces the memory-use when compiling the
>> module_first_rk_step_part1.F90 TU from 521.wrf_r from 25GB to 1GB.
>> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
>> I plan to push this later today unless I hear objection.
>>PR rtl-optimization/117467
>>PR rtl-optimization/117934
>>* ext-dce.cc (ext_dce_execute): Do nothing if a memory
>>allocation estimate exceeds what is allowed by
>>--param max-gcse-memory.
> No objections.  Thanks for taking care of it.  A bit surprised it's this bad, 
> though I guess those bitmaps might be dense for pathological cases.

I‘m quite sure there’s something wrong with the pass, in particular how it 
treats live (it doesn’t seem to prune based on kills?).  But I don’t have time 
to investigate nor less to rewrite its ‘dataflow’.

Richard 

> 
> Jeff
> 


[PATCH] phiopt: Reject hot/cold predictors for early phiopt [PR117935]

2025-01-11 Thread Andrew Pinski
In this case, early phiopt would get rid of the user provided predicator
for hot/cold as it would remove the basic blocks. The easiest and best option is
for early phi-opt don't do phi-opt if the middle basic-block(s) have either
a hot or cold predict statement. Then after inlining, jump threading will most 
likely
happen and that will keep around the predictor.

Note this only needs to be done for match_simplify_replacement and not the other
phi-opt functions because currently only match_simplify_replacement is able to 
skip
middle bb with predicator statements in it.

OK? Bootstrapped and tested on x86_64-linux-gnu.

PR tree-optimization/117935

gcc/ChangeLog:

* tree-ssa-phiopt.cc (contains_hot_cold_predict): New function.
(match_simplify_replacement): Return early if early_p and one of
the middle bb(s) have a hot/cold predict statement.

gcc/testsuite/ChangeLog:

* gcc.dg/predict-24.c: New test.

Signed-off-by: Andrew Pinski 
---
 gcc/testsuite/gcc.dg/predict-24.c | 24 +
 gcc/tree-ssa-phiopt.cc| 35 +++
 2 files changed, 59 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/predict-24.c

diff --git a/gcc/testsuite/gcc.dg/predict-24.c 
b/gcc/testsuite/gcc.dg/predict-24.c
new file mode 100644
index 000..b2c8a77c323
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/predict-24.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-profile_estimate" } */
+/* PR tree-optimization/117935 */
+
+static inline bool has_value(bool b)
+{
+  if (b)
+  {
+[[gnu::hot, gnu::unused]] label1:
+return true;
+  }
+  else
+return false;
+}
+/* The hot label should last until it gets inlined into value_or and 
jump_threaded.  */
+int value_or(bool b, int def0, int def1)
+{
+  if (has_value(b))
+return def0;
+  else
+return def1;
+}
+/* { dg-final { scan-tree-dump-times "first match heuristics: 90.00%" 2 
"profile_estimate"} } */
+/* { dg-final { scan-tree-dump-times "hot label heuristics of edge 
\[0-9\]+->\[0-9]+: 90.00%" 2 "profile_estimate"} } */
diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
index 64d3ba9e160..8fed3eadbb0 100644
--- a/gcc/tree-ssa-phiopt.cc
+++ b/gcc/tree-ssa-phiopt.cc
@@ -56,6 +56,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-ssa-dce.h"
 #include "calls.h"
 #include "tree-ssa-loop-niter.h"
+#include "gimple-predict.h"
 
 /* Return the singleton PHI in the SEQ of PHIs for edges E0 and E1. */
 
@@ -916,6 +917,27 @@ auto_flow_sensitive::~auto_flow_sensitive ()
 p.second.restore (p.first);
 }
 
+/* Returns true if BB contains an user provided predictor
+   (PRED_HOT_LABEL/PRED_COLD_LABEL).  */
+
+static bool
+contains_hot_cold_predict (basic_block bb)
+{
+  gimple_stmt_iterator gsi;
+  gsi = gsi_start_nondebug_after_labels_bb (bb);
+  for (; !gsi_end_p (gsi); gsi_next_nondebug (&gsi))
+{
+  gimple *s = gsi_stmt (gsi);
+  if (gimple_code (s) != GIMPLE_PREDICT)
+   continue;
+  auto predict = gimple_predict_predictor (s);
+  if (predict == PRED_HOT_LABEL
+ || predict == PRED_COLD_LABEL)
+   return true;
+}
+  return false;
+}
+
 /*  The function match_simplify_replacement does the main work of doing the
 replacement using match and simplify.  Return true if the replacement is 
done.
 Otherwise return false.
@@ -953,6 +975,19 @@ match_simplify_replacement (basic_block cond_bb, 
basic_block middle_bb,
  stmt_to_move_alt))
 return false;
 
+  /* For early phiopt, we don't want to lose user generated predictors,
+ that is if either middle bb contains either a hot or cold label predict,
+ the phiopt should be skipped. */
+  if (early_p)
+{
+  if (contains_hot_cold_predict (middle_bb))
+   return false;
+  if (threeway_p
+ && middle_bb != middle_bb_alt
+ && contains_hot_cold_predict (middle_bb_alt))
+   return false;
+}
+
   /* Do not make conditional undefs unconditional.  */
   if ((TREE_CODE (arg0) == SSA_NAME
&& ssa_name_maybe_undef_p (arg0))
-- 
2.43.0



Re: [PATCH v2] RISC-V: Fix riscv_modes_tieable_p

2025-01-11 Thread Zhijin Zeng
I'm so sorry that I didn't describe the patch clearly. I refactored the patch 
and added some new  changes. Initially I split them into two patches, which is 
probably not right.

Thanks,
Zhijin

>From ca072f040c876df4117f475eeb74c7eb8882bed8 Mon Sep 17 00:00:00 2001
From: Zhijin Zeng 
Date: Sat, 11 Jan 2025 12:09:11 +0800
Subject: [PATCH] RISC-V: Fix mode compatibility between floating-point and
 integer value

I find there are some unnecessary fmv instructions in glibc math exp,
and reduce exp function to the attached test case. The unnecessary
fmv instructions as follow:

```
        fld     fa4,16(a4)
        fmadd.d fa2,fa2,fa0,fa5
        fld     fa0,56(a4)
        fmv.x.d a5,fa2             *
        fld     fa2,48(a4)
        fmv.d.x fa1,a5             *
        andi    a3,a5,127
        addi    a3,a3,15
        fsub.d  fa5,fa1,fa5
        slli    a3,a3,3
```

The data of fa2 and fa1 are the same, we should directly use fa2
rather than fa1 in following instructions and save one fmv instruction.

The `fmv.d.x a5,fa2` is correspond to pattern `(subreg:DI (reg/v:DF 143`.
In ira pass, virtual register r143 is assigned to GP_REGS, so its data
need be copied to FP_REGS before `fsub.d fa5,fa1,fa5` by reload pass,
and that's exactly the `fmv.d.x fa1,a5` instruction.

To fix this, we need to change the 4 functions.

1. FP_REGS can't be used for virtual register 143 in `(subreg:DI (reg/v:DF 143`.
I change riscv_hard_regno_mode_ok and riscv_can_change_mode_class to
support it. RISC-V fmv variant instructions can be used to move data
between FP_REGS and GR_REGS, and FP_REGS are compatible with DImode and
SImode.

2. The fwprop1 pass will fold the `(subreg:DI (reg/v:DF 143` and make
the cost calcutation method get incorrect cost and still assign r143 to
GP_REGS. Actually FP_REGS and GR_REGS are not tieable without Zfinx
extension, and need to fix riscv_modes_tieable_p.

3. JALR_REGS is also a subset of GR_REGS and need to be taken
into acount in riscv_register_move_cost, otherwise it will get
a incorrect cost.

gcc/ChangeLog:

        * config/riscv/riscv.cc (riscv_register_move_cost):
        (riscv_hard_regno_mode_ok):
        (riscv_modes_tieable_p):
        (riscv_can_change_mode_class):

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/fwprop1-modes-tieable.c: New test.
---
 gcc/config/riscv/riscv.cc                     | 31 ++-
 .../gcc.target/riscv/fwprop1-modes-tieable.c  | 82 +++
 2 files changed, 110 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/fwprop1-modes-tieable.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 65e09842fde..8a8444d6b92 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -9588,9 +9588,10 @@ riscv_register_move_cost (machine_mode mode,
                          reg_class_t from, reg_class_t to)
 {
   bool from_is_fpr = from == FP_REGS || from == RVC_FP_REGS;
-  bool from_is_gpr = from == GR_REGS || from == RVC_GR_REGS;
+  bool from_is_gpr = from == GR_REGS || from == RVC_GR_REGS
+                    || from == JALR_REGS;
   bool to_is_fpr = to == FP_REGS || to == RVC_FP_REGS;
-  bool to_is_gpr = to == GR_REGS || to == RVC_GR_REGS;
+  bool to_is_gpr = to == GR_REGS || to == RVC_GR_REGS || to == JALR_REGS;
   if ((from_is_fpr && to == to_is_gpr) ||
       (from_is_gpr && to_is_fpr))
     return tune_param->fmv_cost;
@@ -9696,7 +9697,10 @@ riscv_hard_regno_mode_ok (unsigned int regno, 
machine_mode mode)
        return false;

       if (GET_MODE_CLASS (mode) != MODE_FLOAT
-         && GET_MODE_CLASS (mode) != MODE_COMPLEX_FLOAT)
+         && GET_MODE_CLASS (mode) != MODE_COMPLEX_FLOAT
+         && GET_MODE_CLASS (mode) != MODE_COMPLEX_FLOAT
+         && !((mode == DImode && TARGET_DOUBLE_FLOAT && TARGET_64BIT)
+               || (mode == SImode && TARGET_HARD_FLOAT)))
        return false;

       /* Only use callee-saved registers if a potential callee is guaranteed
@@ -9753,6 +9757,11 @@ riscv_modes_tieable_p (machine_mode mode1, machine_mode 
mode2)
      E.g. V2SI and DI are not tieable.  */
   if (riscv_v_ext_mode_p (mode1) != riscv_v_ext_mode_p (mode2))
     return false;
+  if ((GET_MODE_CLASS (mode1) == MODE_FLOAT
+       && GET_MODE_CLASS (mode2) == MODE_INT)
+       || (GET_MODE_CLASS (mode2) == MODE_FLOAT
+       && GET_MODE_CLASS (mode1) == MODE_INT))
+    return false;
   return (mode1 == mode2
          || !(GET_MODE_CLASS (mode1) == MODE_FLOAT
               && GET_MODE_CLASS (mode2) == MODE_FLOAT));
@@ -11160,7 +11169,23 @@ riscv_can_change_mode_class (machine_mode from, 
machine_mode to,
       && riscv_v_ext_vls_mode_p (from)
       && !ordered_p (BITS_PER_RISCV_VECTOR, GET_MODE_PRECISION (from)))
       return false;
+  if ((SCALAR_INT_MODE_P (to) || SCALAR_FLOAT_MODE_P (to))
+       && (rclass == FP_REGS || rclass == RVC_FP_REGS
+       || rclass == GR_REGS || rclass == RVC_GR_REGS
+       || rclass == JALR_REGS))
+    {
+      /* fmv.x.w or fmv.w.x. 

[PATCH]AArch64: don't override march to assembler with mcpu if march is specified [PR110901]

2025-01-11 Thread Tamar Christina
Hi All,

When both -mcpu and -march are specified, the value of -march wins out.

This is done correctly for the calls to cc1 and for the assembler directives we
put out in assembly files.

However in the call to as we don't do this and instead use the arch from the
cpu.  This leads to a situation that GCC cannot reliably be used to compile
assembly files which don't have a .arch directive.

This is quite common with .S files which use macros to selectively enable
codepath based on what the preprocessor sees.

The fix is to change MCPU_TO_MARCH_SPEC to not override the march if an march
is already specified.

Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

PR target/110901
* config/aarch64/aarch64.h (MCPU_TO_MARCH_SPEC): Don't override if
march is set.

gcc/testsuite/ChangeLog:

PR target/110901
* gcc.target/aarch64/options_set_29.c: New test.

---
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 
1ab49e229b080d29187690abdb0c0767c12a157a..218868a5246a19c293c03e7be48107c1b0770e27
 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -1510,7 +1510,7 @@ extern const char *host_detect_local_cpu (int argc, const 
char **argv);
   CONFIG_TUNE_SPEC
 
 #define MCPU_TO_MARCH_SPEC \
-   " %{mcpu=*:-march=%:rewrite_mcpu(%{mcpu=*:%*})}"
+   "%{!march=*:%{mcpu=*:-march=%:rewrite_mcpu(%{mcpu=*:%*})}}"
 
 extern const char *aarch64_rewrite_mcpu (int argc, const char **argv);
 extern const char *is_host_cpu_not_armv8_base (int argc, const char **argv);
diff --git a/gcc/testsuite/gcc.target/aarch64/options_set_29.c 
b/gcc/testsuite/gcc.target/aarch64/options_set_29.c
new file mode 100644
index 
..0a68550951ce61422c4cfea50762b2420a92091e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/options_set_29.c
@@ -0,0 +1,11 @@
+/* { dg-do assemble } */
+/* { dg-additional-options "-march=armv8.2-a+sve -mcpu=cortex-a72 -O1 -w -###" 
} */
+
+int main ()
+{
+  return 0;
+}
+
+/* { dg-message "-march=armv8-a\+crc" "no arch from cpu" { xfail *-*-* } 0 } */
+/* { dg-message "-march=armv8\\.2-a\\+sve" "using only sve" { target *-*-* } 0 
} */
+/* { dg-excess-errors "" } */




-- 
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 1ab49e229b080d29187690abdb0c0767c12a157a..218868a5246a19c293c03e7be48107c1b0770e27 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -1510,7 +1510,7 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
   CONFIG_TUNE_SPEC
 
 #define MCPU_TO_MARCH_SPEC \
-   " %{mcpu=*:-march=%:rewrite_mcpu(%{mcpu=*:%*})}"
+   "%{!march=*:%{mcpu=*:-march=%:rewrite_mcpu(%{mcpu=*:%*})}}"
 
 extern const char *aarch64_rewrite_mcpu (int argc, const char **argv);
 extern const char *is_host_cpu_not_armv8_base (int argc, const char **argv);
diff --git a/gcc/testsuite/gcc.target/aarch64/options_set_29.c b/gcc/testsuite/gcc.target/aarch64/options_set_29.c
new file mode 100644
index ..0a68550951ce61422c4cfea50762b2420a92091e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/options_set_29.c
@@ -0,0 +1,11 @@
+/* { dg-do assemble } */
+/* { dg-additional-options "-march=armv8.2-a+sve -mcpu=cortex-a72 -O1 -w -###" } */
+
+int main ()
+{
+  return 0;
+}
+
+/* { dg-message "-march=armv8-a\+crc" "no arch from cpu" { xfail *-*-* } 0 } */
+/* { dg-message "-march=armv8\\.2-a\\+sve" "using only sve" { target *-*-* } 0 } */
+/* { dg-excess-errors "" } */





[PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-11 Thread Tamar Christina
Hi All,

in g:e91a17fe39c39e98cebe6e1cbc8064ee6846a3a7 we added the ability for
-mcpu=native on unknown CPUs to still enable architecture extensions.

This has worked great but was only added for homogenous systems.

However the same thing works for big.LITTLE as in such system the cores must
have the same extensions otherwise it doesn't fundamentally work.

i.e. task migration from one core to the other wouldn't work.

This extends the same handling to non-homogenous systems.

Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

PR target/113257
* config/aarch64/driver-aarch64.cc (host_detect_local_cpu):

gcc/testsuite/ChangeLog:

PR target/113257
* gcc.target/aarch64/cpunative/info_34: New test.
* gcc.target/aarch64/cpunative/native_cpu_34.c: New test.

---
diff --git a/gcc/config/aarch64/driver-aarch64.cc 
b/gcc/config/aarch64/driver-aarch64.cc
index 
45fce67a646351b848b7cd7d0fd35d343731c0d1..2a454daf031aa3ac81a9a2c03b15c09731e4f56e
 100644
--- a/gcc/config/aarch64/driver-aarch64.cc
+++ b/gcc/config/aarch64/driver-aarch64.cc
@@ -449,6 +449,20 @@ host_detect_local_cpu (int argc, const char **argv)
  break;
}
}
+
+  /* On big.LITTLE if we find any unknown CPUs we can still pick arch
+features as the cores should have the same features.  So just pick
+the feature flags from any of the cpus.  */
+  if (aarch64_cpu_data[i].name == NULL)
+   {
+ auto arch_info = get_arch_from_id (DEFAULT_ARCH);
+
+ gcc_assert (arch_info);
+
+ res = concat ("-march=", arch_info->name, NULL);
+ default_flags = arch_info->flags;
+   }
+
   if (!res)
goto not_found;
 }
diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/info_34 
b/gcc/testsuite/gcc.target/aarch64/cpunative/info_34
new file mode 100644
index 
..61cb254785a4b9ec19ebe388402223c9a82af7ed
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/cpunative/info_34
@@ -0,0 +1,18 @@
+processor  : 0
+BogoMIPS   : 100.00
+Features   : fp asimd evtstrm aes pmull sha1 sha2 crc32 asimddp sve sve2 
fphp asimdhp fcma
+CPU implementer: 0xfe
+CPU architecture: 8
+CPU variant: 0x0
+CPU part   : 0xd08
+CPU revision   : 2
+
+processor  : 0
+BogoMIPS   : 100.00
+Features   : fp asimd evtstrm aes pmull sha1 sha2 crc32 asimddp sve sve2 
fphp asimdhp fcma
+CPU implementer: 0xfe
+CPU architecture: 8
+CPU variant: 0x0
+CPU part   : 0xd09
+CPU revision   : 2
+
diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_34.c 
b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_34.c
new file mode 100644
index 
..168140002a0f0205c0f552de0cce9b2d356e09e2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_34.c
@@ -0,0 +1,12 @@
+/* { dg-do compile { target { { aarch64*-*-linux*} && native } } } */
+/* { dg-set-compiler-env-var GCC_CPUINFO 
"$srcdir/gcc.target/aarch64/cpunative/info_34" } */
+/* { dg-additional-options "-mcpu=native" } */
+
+int main()
+{
+  return 0;
+}
+
+/* { dg-final { scan-assembler {\.arch armv8-a\+dotprod\+crc\+crypto\+sve2\n} 
} } */
+
+/* Test a normal looking procinfo.  */




-- 
diff --git a/gcc/config/aarch64/driver-aarch64.cc b/gcc/config/aarch64/driver-aarch64.cc
index 45fce67a646351b848b7cd7d0fd35d343731c0d1..2a454daf031aa3ac81a9a2c03b15c09731e4f56e 100644
--- a/gcc/config/aarch64/driver-aarch64.cc
+++ b/gcc/config/aarch64/driver-aarch64.cc
@@ -449,6 +449,20 @@ host_detect_local_cpu (int argc, const char **argv)
 	  break;
 	}
 	}
+
+  /* On big.LITTLE if we find any unknown CPUs we can still pick arch
+	 features as the cores should have the same features.  So just pick
+	 the feature flags from any of the cpus.  */
+  if (aarch64_cpu_data[i].name == NULL)
+	{
+	  auto arch_info = get_arch_from_id (DEFAULT_ARCH);
+
+	  gcc_assert (arch_info);
+
+	  res = concat ("-march=", arch_info->name, NULL);
+	  default_flags = arch_info->flags;
+	}
+
   if (!res)
 	goto not_found;
 }
diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/info_34 b/gcc/testsuite/gcc.target/aarch64/cpunative/info_34
new file mode 100644
index ..61cb254785a4b9ec19ebe388402223c9a82af7ed
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/cpunative/info_34
@@ -0,0 +1,18 @@
+processor	: 0
+BogoMIPS	: 100.00
+Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 asimddp sve sve2 fphp asimdhp fcma
+CPU implementer	: 0xfe
+CPU architecture: 8
+CPU variant	: 0x0
+CPU part	: 0xd08
+CPU revision	: 2
+
+processor	: 0
+BogoMIPS	: 100.00
+Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 asimddp sve sve2 fphp asimdhp fcma
+CPU implementer	: 0xfe
+CPU architecture: 8
+CPU variant	: 0x0
+CPU part	: 0xd09
+CPU revision	: 2
+
diff --git a/gcc/testsuite/gcc.target/aarc

Re: [PATCH] c: improve UX for -Wincompatible-pointer-types and C23 [PR116871]

2025-01-11 Thread Jakub Jelinek
On Sat, Jan 11, 2025 at 01:55:42PM -0500, David Malcolm wrote:
> +  if ((takes_void_p (fntype_a) && takes_int_p (fntype_b))
> +  || (takes_int_p (fntype_a) && takes_void_p (fntype_b)))
> +{
> +  inform (UNKNOWN_LOCATION,
> +   "the meaning of %<()%> in function declarations changed"
> +   " in C23 from %<(int)%> to %<(void)%>");

That is not true.
The C17 and earlier meaning of rettype identifier (); or
rettype (*identifier) (); etc. is (in C17 6.7.6.3):
"The empty list in a function declarator that is not part of a definition of 
that function
specifies that no information about the number or types of the parameters is
supplied."
And per C17 6.11.6 it was:
"The use of function declarators with empty parentheses (not prototype-format 
parameter type
declarators) is an obsolescent feature."
E.g. the 6.7.6.3 example attempts to clarify that:
"The declaration
int f(void), *fip(), (*pfi)();
declares a function f with no parameters returning an int, a function fip with 
no parameter specification returning a pointer
to an int, and a pointer pfi to a function with no parameter specification 
returning an int. It is especially useful to compare
the last two. The binding of *fip() is *(fip()) , so that the declaration 
suggests, and the same construction in an expression
requires, the calling of a function fip, and then using indirection through the 
pointer result to yield an int. In the declarator
(*pfi)() , the extra parentheses are necessary to indicate that indirection 
through a pointer to a function yields a function
designator, which is then used to call the function; it returns an int."

As C23 doesn't have any syntax to specify a function with no parameter
specification returning sometype, I'm afraid you can't use something short
in %<%> for the old meaning, you should say maybe
"the meaning of %<()%> in function declarations which are not a definition"
"changed in C23 from function with no parameter specification to %<(void)%>"
or so.

And, the takes_int_p doesn't make any sense.
Sure, in the example you've given, because void (*int_stat)();
int_stat used to be a function pointer to a function with unspecified
parameter specification returning void, you could call through that
int_stat (0); if it actually pointed to a function with int argument,
but also int_stat (0.0); if it actually pointed to a function with double
argument, or int_stat ("foo", 1, 1LL); if it pointed to a function
with const char *, int, long long arguments, etc.

Note, in the K&R style, if there is a function definition
void
foo ()
{
}
then for the declaration it is still a function with no parameter
specification (how other functions see it), while in the actual
definition it has no parameters.  The functions declared with unspecified
parameter specification actually can't be defined with an ellipsis, and
do be compatible can have only types after default argument promotions
(so not float, {,{un,}signed} {char,short}).

Many years ago, most of the functions were just declared in C like
char *strstr ();
(or if they had int return type just used implicit declarations), and
then it is up to users to know the function has 2 arguments which need to be
pointers to char.  If you call such a function, the type of each passed
argument goes through default argument promotion and that is how it expects
the function to be called.

Jakub



Re: [PATCH] testsuite: The expect framework might introduce CR in output

2025-01-11 Thread Torbjorn SVENSSON




On 2025-01-11 20:05, Jonathan Wakely wrote:



On Sat, 11 Jan 2025, 18:31 Torbjörn SVENSSON, 
mailto:torbjorn.svens...@foss.st.com>> 
wrote:


Ok for trunk and releases/gcc-14?


OK, thanks


Oh, mid-air-collision.
Thanks for the fast review Jonathan!
I suppose my v2 should also be ok as it (in addition to 
27_io/print/1.cc) fixes the same kind of issue in 27_io/print/3.cc.


Kind regards,
Torbjörn





--

When running tests using the "sim" config, the command is launched in
non-readonly mode and the text retrieved from the expect command will
then replace all LF with CRLF. (The problem can be found in sim_load
where it calls remote_spawn without an input file).

libstdc++-v3/ChangeLog:

         * 27_io/print/1.cc: Allow both LF and CRLF in test.

Signed-off-by: Torbjörn SVENSSON mailto:torbjorn.svens...@foss.st.com>>
---
  libstdc++-v3/testsuite/27_io/print/1.cc | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libstdc++-v3/testsuite/27_io/print/1.cc b/libstdc++-v3/
testsuite/27_io/print/1.cc
index f6585d9880a..2a74e5002f4 100644
--- a/libstdc++-v3/testsuite/27_io/print/1.cc
+++ b/libstdc++-v3/testsuite/27_io/print/1.cc
@@ -18,7 +18,7 @@ void
  test_println_default()
  {
    std::println("I walk the line");
-  // { dg-output "I walk the line\n" }
+  // { dg-output "I walk the line\r?\n" }
  }

  void
-- 
2.25.1






[PING 1] [PATCH] testsuite: libstdc++: Use effective-target libatomic

2025-01-11 Thread Torbjorn SVENSSON

Gentle ping :)

Kind regards,
Torbjörn

On 2024-12-23 20:00, Torbjörn SVENSSON wrote:

Ok for trunk and releases/gcc-14?

--

Test assumes libatomic.a is always available, but for some embedded
targets, there is no libatomic.a and the test thus fail.

libstdc++/ChangeLog:

* 29_atomics/atomic_float/compare_exchange_padding.cc: Use
effective-target libatomic_available.

Signed-off-by: Torbjörn SVENSSON 
---
  .../29_atomics/atomic_float/compare_exchange_padding.cc  | 1 +
  1 file changed, 1 insertion(+)

diff --git 
a/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc 
b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
index 49626ac6651..9395e3026a7 100644
--- a/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
+++ b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
@@ -1,5 +1,6 @@
  // { dg-do run { target c++20 } }
  // { dg-options "-O0" }
+// { dg-require-effective-target libatomic_available }
  // { dg-additional-options "[atomic_link_flags [get_multilibs]] -latomic" }
  
  #include 




Re: [PATCH] c: improve UX for -Wincompatible-pointer-types and C23 [PR116871]

2025-01-11 Thread Jakub Jelinek
On Sat, Jan 11, 2025 at 08:25:32PM +0100, Jakub Jelinek wrote:
> That is not true.

Implementation-wise, in C17 say
void foo ();
void bar (void);
TYPE_ARG_TYPES (TREE_TYPE (foo_decl)) == NULL
TYPE_ARG_TYPES (TREE_TYPE (bar_decl)) == void_list_node
but in C23 both have TYPE_ARG_TYPES void_list_node, actually even
TREE_TYPE (foo_decl) == TREE_TYPE (bar_decl).

So, if you want to actually point out in diagnostics the difference
(because it makes no sense to talk about some C23 change for code which
wouldn't be valid in C17 either, say
void (*fp) (void);
void fn (int, int);
...
  fp = fn;
)
we'd need to remember in the function type (some flag on it) whether it
would be the unspecified arguments case in C17 and earlier or not.
But such flag can't affect type compatibility etc., it should be there
just for diagnostic purposes only.
And on declaration merging,
void foo ();
void foo ();
should result in the flag still being used on the resulting function type,
but
void foo (void);
void foo ();
or
void foo ();
void foo (void);
shouldn't have it on the resulting function type.

Jakub



Re: [PATCH,LRA] Restrict the reuse of spill slots [PR117868]

2025-01-11 Thread Denis Chertykov
сб, 11 янв. 2025 г. в 22:15, Denis Chertykov :
> after LRA:
> --
> insn 543 r14 {r66} = [sfp+7]# reload insn
> insn 269 r14 {r66} ^= r2 {r67}
> insn 544 [sfp+24] = r14 {r66}   # reload insn -- bug is here !
>
> insn 545 r14 {r69} = [sfp+24]   # reload insn
> insn 547 r13 {r70} = [sfp+8]# reload insn
> insn 270 r14 {r69} ^= r13 {r70}
> insn 546 [sfp+25] = r14 {r69}   # reload insn
> --
>
> The bug appears in insn 544.
> It is a spill address `[sfp+24]' for pseudo r66 which is equal to another
> slot address from insn 544 for pseudo r69.

slot address from insn 545 for pseudo r69.

Denis.


Re: [PATCH] final: Fix get_attr_length for asm goto [PR118411]

2025-01-11 Thread Jeff Law




On 1/11/25 2:08 PM, Andrew Pinski (QUIC) wrote:

-Original Message-
From: Jeff Law 
Sent: Saturday, January 11, 2025 8:12 AM
To: Andrew Pinski (QUIC) ; gcc-
patc...@gcc.gnu.org
Subject: Re: [PATCH] final: Fix get_attr_length for asm goto
[PR118411]



On 1/11/25 2:52 AM, Andrew Pinski wrote:

The problem is for inline-asm goto, the outer rtl insn type is

a

jump_insn and get_attr_length does not handle ASM

specially unlike if

the outer rtl insn type was just insn.

This fixes the issue by adding support for both CALL_INSN

and

JUMP_INSN with asm.

OK? Bootstrapped and tested on x86_64-linux-gnu.

PR middle-end/118411

gcc/ChangeLog:

* final.cc (get_attr_length_1): Handle asm for CALL_INSN
and JUMP_INSNs.

OK


Can I apply this to both the GCC 14, 13 and 12 branches too? Even though this 
is not a regression; it seems like an important bug fix for asm goto. 
Especially since the Linux kernel uses it and might run into this issue.
Seems pretty safe to me and while not a regression, it's a case where we 
generate incorrect output.  So yea, fine by me.  Consider giving it a 
week or so on the trunk just in case...


jeff



RE: [PATCH] final: Fix get_attr_length for asm goto [PR118411]

2025-01-11 Thread Andrew Pinski (QUIC)
> -Original Message-
> From: Jeff Law 
> Sent: Saturday, January 11, 2025 8:12 AM
> To: Andrew Pinski (QUIC) ; gcc-
> patc...@gcc.gnu.org
> Subject: Re: [PATCH] final: Fix get_attr_length for asm goto
> [PR118411]
> 
> 
> 
> On 1/11/25 2:52 AM, Andrew Pinski wrote:
> > The problem is for inline-asm goto, the outer rtl insn type is
> a
> > jump_insn and get_attr_length does not handle ASM
> specially unlike if
> > the outer rtl insn type was just insn.
> >
> > This fixes the issue by adding support for both CALL_INSN
> and
> > JUMP_INSN with asm.
> >
> > OK? Bootstrapped and tested on x86_64-linux-gnu.
> >
> > PR middle-end/118411
> >
> > gcc/ChangeLog:
> >
> > * final.cc (get_attr_length_1): Handle asm for CALL_INSN
> > and JUMP_INSNs.
> OK

Can I apply this to both the GCC 14, 13 and 12 branches too? Even though this 
is not a regression; it seems like an important bug fix for asm goto. 
Especially since the Linux kernel uses it and might run into this issue.

Thanks,
Andrew

> jeff



Re: [PATCH] final: Fix get_attr_length for asm goto [PR118411]

2025-01-11 Thread Jakub Jelinek
On Sat, Jan 11, 2025 at 01:52:59AM -0800, Andrew Pinski wrote:
> The problem is for inline-asm goto, the outer rtl insn type
> is a jump_insn and get_attr_length does not handle ASM specially
> unlike if the outer rtl insn type was just insn.
> 
> This fixes the issue by adding support for both CALL_INSN and JUMP_INSN
> with asm.
> 
> OK? Bootstrapped and tested on x86_64-linux-gnu.
> 
>   PR middle-end/118411
> 
> gcc/ChangeLog:
> 
>   * final.cc (get_attr_length_1): Handle asm for CALL_INSN
>   and JUMP_INSNs.
> 
> Signed-off-by: Andrew Pinski 
> ---
>  gcc/final.cc | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/final.cc b/gcc/final.cc
> index 19c5d390c78..12c6eb0ac09 100644
> --- a/gcc/final.cc
> +++ b/gcc/final.cc
> @@ -363,7 +363,11 @@ get_attr_length_1 (rtx_insn *insn, int (*fallback_fn) 
> (rtx_insn *))
>  
>case CALL_INSN:
>case JUMP_INSN:
> - length = fallback_fn (insn);
> + body = PATTERN (insn);
> + if (GET_CODE (body) == ASM_INPUT || asm_noperands (body) >= 0)
> +   length = asm_insn_count (body) * fallback_fn (insn);
> + else
> +   length = fallback_fn (insn);
>   break;

A CALL_INSN can't be ever an inline asm, and ASM_INPUT (i.e. asm
("something");) can't be even JUMP_INSN.
So, shouldn't this be instead
  case JUMP_INSN:
if (asm_noperands (body) >= 0)
  {
length = asm_insn_count (body) * fallback_fn (insn);
break;
  }
/* FALLTHRU */
  case CALL_INSN:
length = fallback_fn (insn);
break;
?

Jakub