[committed] Fortran: In openmp.cc, uncomment unroll/tile lines of gfc_omp_directives

2025-01-26 Thread Tobias Burnus

This was seemingly forgotten when UNROLL/TILE was added.

Committed as r15-7220-g7cd133a6e4b042 as obvious.

Tobias
commit 7cd133a6e4b04262620489dbf4b4e3ae5e96c95f
Author: Tobias Burnus 
Date:   Mon Jan 27 00:35:17 2025 +0100

Fortran: In openmp.cc, uncomment unroll/tile lines of gfc_omp_directives

Enable unroll and tile for assume's contains/absent clauses as both
directives have been implemented since r15-1037-g804c0f35a6b1d7.

gcc/fortran/ChangeLog:

* openmp.cc (gfc_omp_directives): Uncomment unroll and tile lines
as the directives are by now implemented.
---
 gcc/fortran/openmp.cc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc
index 7875341b2cf..35661d88f1e 100644
--- a/gcc/fortran/openmp.cc
+++ b/gcc/fortran/openmp.cc
@@ -109,8 +109,8 @@ static const struct gfc_omp_directive gfc_omp_directives[] = {
   {"task", GFC_OMP_DIR_EXECUTABLE, ST_OMP_TASK},
   {"teams", GFC_OMP_DIR_EXECUTABLE, ST_OMP_TEAMS},
   {"threadprivate", GFC_OMP_DIR_DECLARATIVE, ST_OMP_THREADPRIVATE},
-  /* {"tile", GFC_OMP_DIR_EXECUTABLE, ST_OMP_TILE}, */
-  /* {"unroll", GFC_OMP_DIR_EXECUTABLE, ST_OMP_UNROLL}, */
+  {"tile", GFC_OMP_DIR_EXECUTABLE, ST_OMP_TILE},
+  {"unroll", GFC_OMP_DIR_EXECUTABLE, ST_OMP_UNROLL},
   {"workshare", GFC_OMP_DIR_EXECUTABLE, ST_OMP_WORKSHARE},
 };
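For readers unfamiliar with the table, here is a minimal sketch of why a commented-out entry matters. It assumes the directive name is resolved by a plain string match against the table; the types and the find_directive helper are hypothetical stand-ins, not the real gfc_omp_directive machinery.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical, simplified stand-ins for GCC's types; not the real definitions.
enum directive_kind { DIR_DECLARATIVE, DIR_EXECUTABLE };

struct omp_directive
{
  const char *name;
  directive_kind kind;
};

// Tail of the table after the patch, with "tile" and "unroll" enabled.
const omp_directive directives[] = {
  {"task", DIR_EXECUTABLE},
  {"teams", DIR_EXECUTABLE},
  {"threadprivate", DIR_DECLARATIVE},
  {"tile", DIR_EXECUTABLE},
  {"unroll", DIR_EXECUTABLE},
  {"workshare", DIR_EXECUTABLE},
};

// Return the matching entry, or nullptr if NAME is not in the table.
// A commented-out entry simply can never be found here.
const omp_directive *
find_directive (const char *name)
{
  for (const omp_directive &d : directives)
    if (std::strcmp (d.name, name) == 0)
      return &d;
  return nullptr;
}

// Small self-check showing the intended use.
void
example_lookup ()
{
  assert (find_directive ("tile") != nullptr);
  assert (find_directive ("no-such-directive") == nullptr);
}
```

With the two lines commented out, a lookup of "tile" or "unroll" in such a table would come back empty, which is presumably why the names were not accepted in contains/absent.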
 


Re: [PATCH v6 4/6] OpenMP: Fortran support for metadirectives and dynamic selectors

2025-01-26 Thread Tobias Burnus

Hi Sandra,

this patch LGTM with some minor comments. Or rather:

I have a few minor comments that should be fixed right away
and a few larger items for which PRs should be filed.

See below.

Sandra Loosemore wrote:


gcc/fortran/ChangeLog
PR middle-end/112779
PR middle-end/113904
* decl.cc (gfc_match_end): Handle COMP_OMP_BEGIN_METADIRECTIVE and
COMP_OMP_METADIRECTIVE.
* dump-parse-tree.cc (show_omp_node): Handle EXEC_OMP_METADIRECTIVE.
(show_code_node): Likewise.
* gfortran.h (enum gfc_statement): Add ST_OMP_METADIRECTIVE,
ST_OMP_BEGIN_METADIRECTIVE, and ST_OMP_END_METADIRECTIVE.
(struct gfc_omp_clauses): Rename target_first_st_is_teams to
target_first_st_is_teams_or_meta.
(struct gfc_omp_variant): New.
(gfc_get_omp_variant): New.
(struct gfc_st_label): Add omp_region field.
(enum gfc_exec_op): Add EXEC_OMP_METADIRECTIVE.
(struct gfc_code): Add omp_variants fields.
(gfc_free_omp_variants): Declare.
(match_omp_directive): Declare.
(is_omp_declarative_stmt): Declare.
* io.cc (format_asterisk): Adjust initializer.
* match.h (gfc_match_omp_begin_metadirective): Declare.
(gfc_match_omp_metadirective): Declare.
* openmp.cc (gfc_match_omp_eos): Adjust to match context selectors.
(gfc_free_omp_variants): New.
(gfc_match_omp_clauses): Remove context_selector parameter and adjust
to use gfc_match_omp_eos instead.
(match_omp): Adjust call to gfc_match_omp_clauses.
(gfc_match_omp_context_selector): Add metadirective_p parameter and
adjust error-checking.  Adjust matching of simd clauses.
(gfc_match_omp_context_selector_specification): Adjust parameters
so it can be used for metadirective as well as declare variant.
(match_omp_metadirective): New.
(gfc_match_omp_begin_metadirective): New.
(gfc_match_omp_metadirective): New.
(resolve_omp_metadirective): New.
(resolve_omp_target): Handle metadirectives.
(gfc_resolve_omp_directive): Handle EXEC_OMP_METADIRECTIVE.
* parse.cc (gfc_matching_omp_context_selector): New.
(gfc_in_omp_metadirective_body): New.
(gfc_omp_region_count): New.
(decode_omp_directive): Handle ST_OMP_BEGIN_METADIRECTIVE and
ST_OMP_METADIRECTIVE.
(match_omp_directive): New.
(case_omp_structured_block): Define.
(case_omp_do): Define.
(gfc_ascii_statement): Handle ST_OMP_BEGIN_METADIRECTIVE,
ST_OMP_END_METADIRECTIVE, and ST_OMP_METADIRECTIVE.
(accept_statement):  Handle ST_OMP_METADIRECTIVE and
ST_OMP_BEGIN_METADIRECTIVE.
(gfc_omp_end_stmt): New, split from...
(parse_omp_do): ...here, and...
(parse_omp_structured_block): ...here.  Handle metadirectives.
(parse_omp_metadirective_body): New.
(parse_executable): Handle metadirective.  Use new case macros
defined above.
(gfc_parse_file): Initialize metadirective state.
(is_omp_declarative_stmt): New.
* parse.h (enum gfc_compile_state): Add COMP_OMP_METADIRECTIVE
and COMP_OMP_BEGIN_METADIRECTIVE.
(gfc_omp_end_stmt): Declare.
(gfc_matching_omp_context_selector): Declare.
(gfc_in_omp_metadirective_body): Declare.
(gfc_omp_metadirective_region_count): Declare.
* resolve.cc (gfc_resolve_code): Handle EXEC_OMP_METADIRECTIVE.
* st.cc (gfc_free_statement): Likewise.
* symbol.cc (compare_st_labels): Handle labels within a metadirective
body.
(gfc_get_st_label): Likewise.
* trans-decl.cc (gfc_get_label_decl): Encode the metadirective region
in the label_name.
* trans-openmp.cc (gfc_trans_omp_directive): Handle
EXEC_OMP_METADIRECTIVE.
(gfc_trans_omp_set_selector): New, split/adapted from code
(gfc_trans_omp_declare_variant): ...here.
(gfc_trans_omp_metadirective): New.
* trans-stmt.h  (gfc_trans_omp_metadirective): Declare.
* trans.cc (trans_code): Handle EXEC_OMP_METADIRECTIVE.

gcc/testsuite/ChangeLog
PR middle-end/112779
PR middle-end/113904
* gfortran.dg/gomp/metadirective-1.f90: New.
* gfortran.dg/gomp/metadirective-10.f90: New.
* gfortran.dg/gomp/metadirective-11.f90: New.
* gfortran.dg/gomp/metadirective-12.f90: New.
* gfortran.dg/gomp/metadirective-2.f90: New.
* gfortran.dg/gomp/metadirective-3.f90: New.
* gfortran.dg/gomp/metadirective-4.f90: New.
* gfortran.dg/gomp/metadirective-5.f90: New.
* gfortran.dg/gomp/metadirective-6.f90: New.
* gfortran.dg/gomp/metadirective-7.f90: New.
* gfortran.dg/gomp/metadirective-8.f90: New.
* gfortran.dg/gomp/metadirective-9.f90: New.
* gfortran.dg/gomp/metadirective-construct.f90: New.
* gfortran.dg

Re: [PATCH v2 7/7] Alpha: Add option to avoid data races for partial writes [PR117759]

2025-01-26 Thread Gaius Mulley
"Maciej W. Rozycki"  writes:

 ...
 
> There are notable regressions between a plain `-mno-bwx' configuration
> and a `-mno-bwx -msafe-partial' one:
>
> FAIL: gm2/iso/run/pass/strcons.mod execution,  -g
> FAIL: gm2/iso/run/pass/strcons.mod execution,  -O
> FAIL: gm2/iso/run/pass/strcons.mod execution,  -O -g
> FAIL: gm2/iso/run/pass/strcons.mod execution,  -Os
> FAIL: gm2/iso/run/pass/strcons.mod execution,  -O3 -fomit-frame-pointer
> FAIL: gm2/iso/run/pass/strcons.mod execution,  -O3 -fomit-frame-pointer -finline-functions
> FAIL: gm2/iso/run/pass/strcons4.mod execution,  -g
> FAIL: gm2/iso/run/pass/strcons4.mod execution,  -O
> FAIL: gm2/iso/run/pass/strcons4.mod execution,  -O -g
> FAIL: gm2/iso/run/pass/strcons4.mod execution,  -Os
> FAIL: gm2/iso/run/pass/strcons4.mod execution,  -O3 -fomit-frame-pointer
> FAIL: gm2/iso/run/pass/strcons4.mod execution,  -O3 -fomit-frame-pointer -finline-functions
>
> Just as with `-msafe-bwa' regressions they come from the fact that these 
> test cases end up calling code that expects a reference to aligned data 
> but is handed one to unaligned data, causing an alignment exception with 
> LDL_L or LDQ_L, which will eventually be fixed up by Linux.
>
> In some cases GCC chooses to open-code block memory write operations, so 
> with non-BWX targets `-msafe-partial' will in the usual case have to be 
> used together with `-msafe-bwa'.
>

I've logged PR 118600
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118600

and have an experimental proposed patch and changelog below.  In summary
the patch tests every assignment (of a constructor to a designator) to
ensure the types are GCC equivalent.  If they are equivalent then it
uses assignment and if not then it copies a structure by field and uses
strncpy to copy a string cst into an array.  I wonder if these changes
fix the regression test failures seen on Alpha above?

regards,
Gaius

--

PR modula2/118600 Assigning to a record causes alignment exception

This patch recursively tests every assignment (of a constructor
to a designator) to ensure the types are GCC equivalent.  If they
are equivalent then it uses gimple assignment and if not then it
copies a structure by field and uses __builtin_strncpy to copy a
string cst into an array.  Unions are copied by __builtin_memcpy.
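
As a rough illustration of the three copy strategies named above (plain assignment, copy by field with strncpy for the string, memcpy for unions), here is a sketch in plain C++ rather than GCC trees. The Rec and U types and all function names are hypothetical; the real patch works on tree nodes in m2statement.cc.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical record with a fixed-size character array field.
struct Rec
{
  int tag;
  char name[8];
};

// Hypothetical union, copied bytewise.
union U
{
  int i;
  float f;
};

// Equivalent types: a whole-object assignment is fine.
void
assign_whole (Rec &dst, const Rec &src)
{
  dst = src;
}

// Source is not type-equivalent (e.g. a constructor holding a string
// constant): copy field by field, and copy the string with strncpy so
// that no read goes past the end of the shorter string constant.
void
copy_by_field (Rec &dst, int tag, const char *name)
{
  dst.tag = tag;
  std::strncpy (dst.name, name, sizeof dst.name);
}

// Unions are copied as raw bytes.
void
copy_union (U &dst, const U &src)
{
  std::memcpy (&dst, &src, sizeof dst);
}

// Small self-check showing the intended use.
void
example_copy ()
{
  Rec r = {};
  copy_by_field (r, 1, "abc");
  assert (std::strcmp (r.name, "abc") == 0);
}
```

Note that strncpy pads the remainder of the destination with NULs but does not terminate it when the source fills the whole array.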

gcc/m2/ChangeLog:

* gm2-compiler/M2GenGCC.mod (PerformCodeBecomes): New procedure.
(CodeBecomes): Refactor and call PerformCodeBecomes.
* gm2-gcc/m2builtins.cc (gm2_strncpy_node): New global variable.
(DoBuiltinStrNCopy): New function.
(m2builtins_BuiltinStrNCopy): New function.
(m2builtins_init): Initialize gm2_strncpy_node.
* gm2-gcc/m2builtins.def (BuiltinStrNCopy): New procedure
function.
* gm2-gcc/m2builtins.h (m2builtins_BuiltinStrNCopy): New
function.
* gm2-gcc/m2statement.cc (copy_record_fields): New function.
(copy_array): Ditto.
(copy_strncpy): Ditto.
(copy_memcpy): Ditto.
(CopyByField_Lower): Ditto.
(m2statement_CopyByField): Ditto.
* gm2-gcc/m2statement.def (CopyByField): New procedure function.
* gm2-gcc/m2statement.h (m2statement_CopyByField): New function.
* gm2-gcc/m2type.cc (check_record_fields): Ditto.
(check_array_types): Ditto.
(m2type_IsGccStrictTypeEquivalent): Ditto.
* gm2-gcc/m2type.def (IsGccStrictTypeEquivalent): New procedure
function.
* gm2-gcc/m2type.h (m2type_IsAddress): Replace return type int
with bool.

diff --git a/gcc/m2/gm2-compiler/M2GenGCC.mod b/gcc/m2/gm2-compiler/M2GenGCC.mod
index bba77ff12e1..912dfe7b8e8 100644
--- a/gcc/m2/gm2-compiler/M2GenGCC.mod
+++ b/gcc/m2/gm2-compiler/M2GenGCC.mod
@@ -43,7 +43,7 @@ FROM SymbolTable IMPORT PushSize, PopSize, PushValue, PopValue,
 IsConst, IsConstSet, IsProcedure, IsProcType,
 IsVar, IsVarParamAny, IsTemporary, IsTuple,
 IsEnumeration,
-IsUnbounded, IsArray, IsSet, IsConstructor,
+IsUnbounded, IsArray, IsSet, IsConstructor, IsConstructorConstant,
 IsProcedureVariable,
 IsUnboundedParamAny,
 IsRecordField, IsFieldVarient, IsVarient, IsRecord,
@@ -231,7 +231,7 @@ FROM m2statement IMPORT BuildAsm, BuildProcedureCallTree, BuildParam, BuildFunct
 BuildReturnValueCode, SetLastFunction,
 BuildIncludeVarConst, BuildIncludeVarVar,
 BuildExcludeVarConst, BuildExcludeVarVar,
-BuildBuiltinCallTree,
+BuildBuiltinCallTree, CopyByField,
GetParamTree, BuildCleanUp,
BuildTryFinally,
GetLastFunction, SetLastFunction,
@@ -240,7 +240,7 @@ FROM m2statement IMPORT BuildAsm, BuildProcedureCallTree, 
BuildParam, Bu

[PATCH] Fortran: fix bogus diagnostics on renamed interface import [PR110993]

2025-01-26 Thread Harald Anlauf

Dear all,

in the checking of imported interfaces we need to use the local names
of procedures that are renamed-on-use, as the original name becomes
inaccessible.  Similarly, we should not compare interfaces of
non-bind(C) procedures against bind(C) interfaces that are not
explicitly made accessible via a use statement, see testcase.
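
To make the name selection concrete, here is a small sketch of the priority order used when searching for the interface: binding label first, then the local name from rename-on-use, then the original name. The symbol struct below is a hypothetical, heavily simplified stand-in for gfc_symbol and its use-statement bookkeeping.

```cpp
#include <cassert>
#include <string>

// Hypothetical, heavily simplified stand-in for gfc_symbol.
struct symbol
{
  std::string name;          // name in the defining module
  std::string binding_label; // bind(C) label, empty if none
  bool use_rename;           // symbol was renamed on use
  std::string local_name;    // local name from the rename, empty if none
};

// Pick the name to search for, in the order described above.
const std::string &
lookup_name (const symbol &sym)
{
  if (!sym.binding_label.empty ())
    return sym.binding_label;
  if (sym.use_rename && !sym.local_name.empty ())
    return sym.local_name;
  return sym.name;
}

// Small self-check showing the intended use.
void
example_lookup_name ()
{
  symbol renamed = {"bar", "", true, "notthisone"};
  assert (lookup_name (renamed) == "notthisone");
}
```

Under these assumptions, a procedure imported as `notthisone => bar` is looked up as "notthisone", so an unrelated external "bar" is no longer checked against the module interface.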

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

Could this one be backportable, e.g. to 14-branch?

Thanks,
Harald

From fb19a4bd29f49935514a7c2a43dbc9f2a6e9b147 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Sun, 26 Jan 2025 22:56:57 +0100
Subject: [PATCH] Fortran: fix bogus diagnostics on renamed interface import
 [PR110993]

	PR fortran/110993

gcc/fortran/ChangeLog:

	* frontend-passes.cc (check_externals_procedure): Do not compare
	interfaces of a non-bind(C) procedure against a bind(C) global one.
	(check_against_globals): Use local name from rename-on-use in the
	search for interfaces.

gcc/testsuite/ChangeLog:

	* gfortran.dg/use_rename_14.f90: New test.
---
 gcc/fortran/frontend-passes.cc  |  7 
 gcc/testsuite/gfortran.dg/use_rename_14.f90 | 46 +
 2 files changed, 53 insertions(+)
 create mode 100644 gcc/testsuite/gfortran.dg/use_rename_14.f90

diff --git a/gcc/fortran/frontend-passes.cc b/gcc/fortran/frontend-passes.cc
index 987238794da..6b470b83e21 100644
--- a/gcc/fortran/frontend-passes.cc
+++ b/gcc/fortran/frontend-passes.cc
@@ -5704,6 +5704,9 @@ check_externals_procedure (gfc_symbol *sym, locus *loc,
   if (gsym->ns)
 gfc_find_symbol (sym->name, gsym->ns, 0, &def_sym);
 
+  if (gsym->bind_c && def_sym && def_sym->binding_label == NULL)
+return 0;
+
   if (def_sym)
 {
   gfc_compare_actual_formal (&actual, def_sym->formal, 0, 0, 0, loc);
@@ -5800,6 +5803,10 @@ check_against_globals (gfc_symbol *sym)
 
   if (sym->binding_label)
 sym_name = sym->binding_label;
+  else if (sym->attr.use_rename
+	   && sym->ns->use_stmts->rename
+	   && sym->ns->use_stmts->rename->local_name[0] != '\0')
+sym_name = sym->ns->use_stmts->rename->local_name;
   else
 sym_name = sym->name;
 
diff --git a/gcc/testsuite/gfortran.dg/use_rename_14.f90 b/gcc/testsuite/gfortran.dg/use_rename_14.f90
new file mode 100644
index 00000000000..03815a5f229
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/use_rename_14.f90
@@ -0,0 +1,46 @@
+! { dg-do compile }
+!
+! PR fortran/110993 - bogus diagnostics on renamed interface import
+!
+! Contributed by Rimvydas Jasinskas 
+
+module m
+  interface
+subroutine bar(x)
+  use iso_c_binding, only : c_float
+  implicit none
+  real(c_float) :: x(45)
+end subroutine
+  end interface
+end
+
+module m1
+  interface
+subroutine bar1(x) bind(c)
+  use iso_c_binding, only : c_float
+  implicit none
+  real(c_float) :: x(45)
+end subroutine
+  end interface
+end
+
+module m2
+  interface
+subroutine bar2(x) bind(c, name="bar2_")
+  use iso_c_binding, only : c_float
+  implicit none
+  real(c_float) :: x(45)
+end subroutine
+  end interface
+end
+
+subroutine foo(y)
+  use m,  notthisone => bar
+  use m1, northisone => bar1
+  use m2,  orthisone => bar2
+  implicit none
+  real :: y(3)
+  call bar (y)
+  call bar1(y)
+  call bar2(y)
+end subroutine
-- 
2.43.0



Re: [PATCH] Fortran: fix bogus diagnostics on renamed interface import [PR110993]

2025-01-26 Thread Jerry D

On 1/26/25 2:07 PM, Harald Anlauf wrote:
> Dear all,
>
> in the checking of imported interfaces we need to use the local names
> of procedures that are renamed-on-use, as the original name becomes
> inaccessible.  Similarly, we should not compare interfaces of
> non-bind(C) procedures against bind(C) interfaces that are not
> explicitly made accessible via a use statement, see testcase.
>
> Regtested on x86_64-pc-linux-gnu.  OK for mainline?
>
> Could this one be backportable, e.g. to 14-branch?
>
> Thanks,
> Harald



This is OK. Backport up to you.

Jerry



[PATCH v1] RISC-V: Remove unnecessary frm restore volatile define_insn

2025-01-26 Thread pan2 . li
From: Pan Li 

Now that the frm register has been added to global_regs, we may no
longer need a volatile define_insn to emit the frm restore insns.  The
cooperatively-managed global register takes care of this, so the
volatile define_insn does not have to be emitted explicitly.
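
The decision logic after the refactoring can be sketched as a small pure function. This is only an illustration under stated assumptions: the enum values, static_frm_mode_p and frm_mode_set_action are hypothetical stand-ins for riscv_vector::FRM_*, riscv_static_frm_mode_p and riscv_emit_frm_mode_set, and the frm backup emitted when the previous mode is DYN_CALL is left out.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for the riscv_vector::FRM_* values.
enum frm_mode
{
  FRM_RNE, FRM_RTZ, FRM_RDN, FRM_RUP, FRM_RMM,  // static modes
  FRM_DYN, FRM_DYN_EXIT, FRM_DYN_CALL           // dynamic modes
};

enum frm_action { FRM_NOTHING, FRM_SET_STATIC, FRM_RESTORE_BACKUP };

// Stand-in for riscv_static_frm_mode_p.
bool
static_frm_mode_p (frm_mode mode)
{
  return mode <= FRM_RMM;
}

// STATIC_FRM stands in for STATIC_FRM_P (cfun).
frm_action
frm_mode_set_action (frm_mode mode, frm_mode prev, bool static_frm)
{
  if (mode == prev)
    return FRM_NOTHING;

  // Switching to a static mode always sets frm to that mode.
  if (static_frm_mode_p (mode))
    return FRM_SET_STATIC;

  bool restore_p
    = (static_frm && mode == FRM_DYN_CALL && prev != FRM_DYN)
      || (static_frm && mode == FRM_DYN_EXIT
	  && prev != FRM_DYN && prev != FRM_DYN_CALL)
      || (mode == FRM_DYN && prev != FRM_DYN_CALL);

  return restore_p ? FRM_RESTORE_BACKUP : FRM_NOTHING;
}

// Small self-check showing the intended use.
void
example_frm ()
{
  assert (frm_mode_set_action (FRM_DYN_EXIT, FRM_RTZ, true)
	  == FRM_RESTORE_BACKUP);
}
```

All three restore cases now map to the same action, which is the point of the patch: a separate volatile variant is no longer distinguished.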

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_emit_frm_mode_set): Refactor
the frm mode set by removing fsrmsi_restore_volatile.
* config/riscv/vector-iterators.md (unspecv): Remove as unnecessary.
* config/riscv/vector.md (fsrmsi_restore_volatile): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-dynamic-frm-49.c: Adjust
the asm dump check times.
* gcc.target/riscv/rvv/base/float-point-dynamic-frm-50.c: Ditto.
* gcc.target/riscv/rvv/base/float-point-dynamic-frm-52.c: Ditto.
* gcc.target/riscv/rvv/base/float-point-dynamic-frm-74.c: Ditto.
* gcc.target/riscv/rvv/base/float-point-dynamic-frm-75.c: Ditto.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv.cc | 43 ++-
 gcc/config/riscv/vector-iterators.md  |  4 --
 gcc/config/riscv/vector.md| 13 --
 .../rvv/base/float-point-dynamic-frm-49.c |  2 +-
 .../rvv/base/float-point-dynamic-frm-50.c |  2 +-
 .../rvv/base/float-point-dynamic-frm-52.c |  2 +-
 .../rvv/base/float-point-dynamic-frm-74.c |  2 +-
 .../rvv/base/float-point-dynamic-frm-75.c |  2 +-
 8 files changed, 28 insertions(+), 42 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index dd50fe4eddf..8e3bf0077cd 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -12031,27 +12031,30 @@ riscv_emit_frm_mode_set (int mode, int prev_mode)
   if (prev_mode == riscv_vector::FRM_DYN_CALL)
 emit_insn (gen_frrmsi (backup_reg)); /* Backup frm when DYN_CALL.  */
 
-  if (mode != prev_mode)
-{
-  rtx frm = gen_int_mode (mode, SImode);
-
-  if (mode == riscv_vector::FRM_DYN_CALL
-   && prev_mode != riscv_vector::FRM_DYN && STATIC_FRM_P (cfun))
-   /* No need to emit when prev mode is DYN already.  */
-   emit_insn (gen_fsrmsi_restore_volatile (backup_reg));
-  else if (mode == riscv_vector::FRM_DYN_EXIT && STATIC_FRM_P (cfun)
-   && prev_mode != riscv_vector::FRM_DYN
-   && prev_mode != riscv_vector::FRM_DYN_CALL)
-   /* No need to emit when prev mode is DYN or DYN_CALL already.  */
-   emit_insn (gen_fsrmsi_restore_volatile (backup_reg));
-  else if (mode == riscv_vector::FRM_DYN
-   && prev_mode != riscv_vector::FRM_DYN_CALL)
-   /* Restore frm value from backup when switch to DYN mode.  */
-   emit_insn (gen_fsrmsi_restore (backup_reg));
-  else if (riscv_static_frm_mode_p (mode))
-   /* Set frm value when switch to static mode.  */
-   emit_insn (gen_fsrmsi_restore (frm));
+  if (mode == prev_mode)
+return;
+
+  if (riscv_static_frm_mode_p (mode))
+{
+  /* Set frm value when switch to static mode.  */
+  emit_insn (gen_fsrmsi_restore (gen_int_mode (mode, SImode)));
+  return;
 }
+
+  bool restore_p
+= /* No need to emit when prev mode is DYN.  */
+  (STATIC_FRM_P (cfun) && mode == riscv_vector::FRM_DYN_CALL
+   && prev_mode != riscv_vector::FRM_DYN)
+  /* No need to emit if prev mode is DYN or DYN_CALL.  */
+  || (STATIC_FRM_P (cfun) && mode == riscv_vector::FRM_DYN_EXIT
+ && prev_mode != riscv_vector::FRM_DYN
+ && prev_mode != riscv_vector::FRM_DYN_CALL)
+  /* Restore frm value when switch to DYN mode.  */
+  || (mode == riscv_vector::FRM_DYN
+ && prev_mode != riscv_vector::FRM_DYN_CALL);
+
+  if (restore_p)
+emit_insn (gen_fsrmsi_restore (backup_reg));
 }
 
 /* Implement Mode switching.  */
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index c1bd7397441..f64e7ad70dd 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -122,10 +122,6 @@ (define_c_enum "unspec" [
   UNSPEC_SF_VFNRCLIPU
 ])
 
-(define_c_enum "unspecv" [
-  UNSPECV_FRM_RESTORE_EXIT
-])
-
 ;; Subset of VI with fractional LMUL types
 (define_mode_iterator VI_FRAC [
   RVVMF2QI RVVMF4QI (RVVMF8QI "TARGET_MIN_VLEN > 32")
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index cf22b39d6cb..fe10eabeb2e 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -1116,19 +1116,6 @@ (define_insn "fsrmsi_restore"
(set_attr "mode" "SI")]
  )
 
-;; The volatile fsrmsi restore is used for the exit point for the
-;; dynamic mode switching. It will generate one volatile fsrm a5
-;; which won't be eliminated.
-(define_insn "fsrmsi_restore_volatile"
-  [(set (reg:SI FRM_REGNUM)
-   (unspec_volatile:SI [(match_operand:SI 0 "register_operand" "r")]
-   UNSPECV_FRM_RESTORE_EXIT))]
-  "TARGET_VECTOR"
-  "fsrm\t%0"
-  [(set_attr "type" "wrfrm")
-   (set_attr "mode"

Re: [PATCH] RISC-V: Disable two-source permutes for now [PR117173].

2025-01-26 Thread Jeff Law




On 1/24/25 3:57 AM, Robin Dapp wrote:
>> So this isn't a regression, but I can also understand the desire to fix
>> this fairly significant performance issue.
>
> I'd argue it is a regression as the match.pd pattern that merges the permutes
> was introduced after GCC 14.

Good point.  I hadn't thought about it as resolving the performance
regression from that change.




After giving it a bit more thought, I'd still like to send the attached v2
because it excludes fewer cases and, consequently, requires fewer changes to
the test suite.

Regtested on rv64gcv_zvl512b.

Regards
  Robin

[PATCH v2] RISC-V: Disable two-source permutes for now [PR117173].

After testing on the BPI (4.2% improvement for x264 input 1, 4.4% for
input 2) and the discussion in PR117173 I figured it's best to disable
the two-source permutes by default for now.

The patch adds a parameter "riscv-two-source-permutes" which restores
the old behavior.
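
For readers less familiar with the terminology, here is a sketch of what distinguishes a single-source from a two-source permute. It assumes the usual selector convention where indices 0..n-1 pick from the first input and n..2n-1 from the second; the function names are hypothetical and this is not how riscv-v.cc is structured.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Return true if every selector index picks from the same input vector.
// Indices are assumed to be in the range [0, 2 * sel.size ()).
bool
single_source_p (const std::vector<unsigned> &sel)
{
  const std::size_t n = sel.size ();
  bool uses_first = false;
  bool uses_second = false;
  for (unsigned idx : sel)
    {
      if (idx < n)
	uses_first = true;
      else
	uses_second = true;
    }
  return !(uses_first && uses_second);
}

// Reference semantics of a permute over two inputs A and B of equal length.
std::vector<int>
permute (const std::vector<int> &a, const std::vector<int> &b,
	 const std::vector<unsigned> &sel)
{
  const std::size_t n = sel.size ();
  std::vector<int> out (n);
  for (std::size_t i = 0; i < n; ++i)
    out[i] = sel[i] < n ? a[sel[i]] : b[sel[i] - n];
  return out;
}

// Small self-check showing the intended use.
void
example_permute ()
{
  const std::vector<unsigned> reverse = {3, 2, 1, 0};
  assert (single_source_p (reverse));
}
```

A selector such as {0, 4, 1, 5} interleaves both inputs and is therefore a two-source permute, the kind the patch stops expanding by default.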

PR target/117173

gcc/ChangeLog:

* config/riscv/riscv-v.cc (shuffle_generic_patterns): Only
support single-source permutes by default.
* config/riscv/riscv.opt: New param "riscv-two-source-permutes".

gcc/testsuite/ChangeLog:

* gcc.dg/fold-perm-2.c: Run with two-source permutes.
* gcc.dg/pr54346.c: Ditto.

OK
jeff



Re: [PATCH v1] RISC-V: Remove unnecessary frm restore volatile define_insn

2025-01-26 Thread Jeff Law




On 1/26/25 6:33 AM, pan2...@intel.com wrote:

> From: Pan Li
>
> After we add the frm register to the global_regs, we may not need to
> define_insn that volatile to emit the frm restore insns.  The
> cooperatively-managed global register will help to handle this, instead
> of emit the volatile define_insn explicitly.
>
> gcc/ChangeLog:
>
> * config/riscv/riscv.cc (riscv_emit_frm_mode_set): Refactor
> the frm mode set by removing fsrmsi_restore_volatile.
> * config/riscv/vector-iterators.md (unspecv): Remove as unnecessary.
> * config/riscv/vector.md (fsrmsi_restore_volatile): Ditto.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/float-point-dynamic-frm-49.c: Adjust
> the asm dump check times.
> * gcc.target/riscv/rvv/base/float-point-dynamic-frm-50.c: Ditto.
> * gcc.target/riscv/rvv/base/float-point-dynamic-frm-52.c: Ditto.
> * gcc.target/riscv/rvv/base/float-point-dynamic-frm-74.c: Ditto.
> * gcc.target/riscv/rvv/base/float-point-dynamic-frm-75.c: Ditto.

It's a nice cleanup, but let's defer since it doesn't fix a bug.

jeff



RE: [PATCH v1] RISC-V: Remove unnecessary frm restore volatile define_insn

2025-01-26 Thread Li, Pan2
> It's a nice cleanup, but let's defer since it doesn't fix a bug.

Sure thing, will defer to gcc-16.

Pan




[RFC PATCH] i386: Re-alias -mavx10.2 to 512 bit and make -mno-avx10.x-512 disable the whole AVX10.x

2025-01-26 Thread Haochen Jiang
Hi all,

AVX10 has been published for one and half year and we have got many feedbacks
on that, one of the feedback is on whether the alias option -mavx10.x should
point to 256 or 512.

If you also pay attention to LLVM community, you might see this thread related
to AVX10 options just sent out several hours ago:

[X86][AVX10] Disable m[no-]avx10.1 and switch m[no-]avx10.2 to alias of 512 bit 
options
https://github.com/llvm/llvm-project/pull/124511

In GCC, we will do the same. This RFC patch differs slightly from the LLVM
one and only includes:

  - Switch -m[no-]avx10.2 to an alias of the 512 bit options.
  - Change -mno-avx10.[1,2]-512 to disable both 256 and 512 bit instructions.
  As a result, -mno-avx10.2 still disables both 256 and 512 bit instructions
  now that the alias points to 512.
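
The effect of the merged UNSET masks can be sketched with plain bit arithmetic. The bit positions below are made up for illustration and the constant names are shortened, hypothetical versions of the OPTION_MASK_ISA2_* macros; only the way the masks are composed follows the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bit assignments; the real values live in generated headers.
constexpr std::uint64_t AVX10_1_256 = std::uint64_t{1} << 0;
constexpr std::uint64_t AVX10_1_512 = std::uint64_t{1} << 1;
constexpr std::uint64_t AVX10_2_256 = std::uint64_t{1} << 2;
constexpr std::uint64_t AVX10_2_512 = std::uint64_t{1} << 3;
constexpr std::uint64_t AMX_AVX512  = std::uint64_t{1} << 4;
constexpr std::uint64_t OTHER_ISA   = std::uint64_t{1} << 5;

// Composition as in the patch: one UNSET mask per AVX10 version,
// covering both vector lengths and everything that depends on them.
constexpr std::uint64_t AMX_AVX512_UNSET = AMX_AVX512;
constexpr std::uint64_t AVX10_2_UNSET
  = AVX10_2_256 | AVX10_2_512 | AMX_AVX512_UNSET;
constexpr std::uint64_t AVX10_1_UNSET
  = AVX10_1_256 | AVX10_1_512 | AVX10_2_UNSET;

// What the -mno-... handling does to the ISA flags.
constexpr std::uint64_t
disable (std::uint64_t flags, std::uint64_t unset_mask)
{
  return flags & ~unset_mask;
}

constexpr std::uint64_t ALL
  = AVX10_1_256 | AVX10_1_512 | AVX10_2_256 | AVX10_2_512
    | AMX_AVX512 | OTHER_ISA;

// -mno-avx10.2[-512]: drops AVX10.2 at both lengths, keeps AVX10.1.
static_assert (disable (ALL, AVX10_2_UNSET)
	       == (AVX10_1_256 | AVX10_1_512 | OTHER_ISA), "");
// -mno-avx10.1[-512]: drops all of AVX10.
static_assert (disable (ALL, AVX10_1_UNSET) == OTHER_ISA, "");

// Small run-time self-check mirroring the static assertions.
void
example_masks ()
{
  assert (disable (ALL, AVX10_1_UNSET) == OTHER_ISA);
}
```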

It does not include disabling -m[no-]avx10.1, since I would still like more
input on how to handle that. We have three choices:

 a. Directly re-alias -m[no-]avx10.1 to -m[no-]avx10.1-512 in GCC 15 and
 backport to GCC 14.
 b. Disable -m[no-]avx10.1 in GCC 15, and add it back as an alias of
 -m[no-]avx10.1-512 in the future. This guards against someone cross-compiling
 with different versions of GCC with -mavx10.1 and silently getting
 unexpected results.
 c. Disable -m[no-]avx10.1 in GCC 15, and never add it back, since the option
 has been 256 bit and changing it back and forth is messy.

This might be the last chance to change the alias option, since real
AVX10.1 hardware is coming soon. The change is x86 specific, so it might still
squeeze into GCC 15 at this stage.

I call this an RFC patch because the documentation and testcases also need to
change accordingly, which makes the patch incomplete. Discussion and input are
welcome on this topic.

Thx,
Haochen

---
 gcc/common/config/i386/i386-common.cc | 30 +--
 gcc/common/config/i386/i386-isas.h|  2 +-
 gcc/config/i386/i386-options.cc   |  2 +-
 gcc/config/i386/i386.opt  |  4 ++--
 gcc/doc/extend.texi   |  8 ---
 gcc/doc/sourcebuild.texi  |  4 ++--
 6 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/gcc/common/config/i386/i386-common.cc b/gcc/common/config/i386/i386-common.cc
index 52ad1c5acd1..3891fca8ecb 100644
--- a/gcc/common/config/i386/i386-common.cc
+++ b/gcc/common/config/i386/i386-common.cc
@@ -325,14 +325,12 @@ along with GCC; see the file COPYING3.  If not see
 #define OPTION_MASK_ISA2_APX_F_UNSET OPTION_MASK_ISA2_APX_F
 #define OPTION_MASK_ISA2_EVEX512_UNSET OPTION_MASK_ISA2_EVEX512
 #define OPTION_MASK_ISA2_USER_MSR_UNSET OPTION_MASK_ISA2_USER_MSR
-#define OPTION_MASK_ISA2_AVX10_1_256_UNSET \
-  (OPTION_MASK_ISA2_AVX10_1_256 | OPTION_MASK_ISA2_AVX10_1_512_UNSET \
-   | OPTION_MASK_ISA2_AVX10_2_256_UNSET)
-#define OPTION_MASK_ISA2_AVX10_1_512_UNSET \
-  (OPTION_MASK_ISA2_AVX10_1_512 | OPTION_MASK_ISA2_AVX10_2_512_UNSET)
-#define OPTION_MASK_ISA2_AVX10_2_256_UNSET OPTION_MASK_ISA2_AVX10_2_256
-#define OPTION_MASK_ISA2_AVX10_2_512_UNSET \
-  (OPTION_MASK_ISA2_AVX10_2_512 | OPTION_MASK_ISA2_AMX_AVX512_UNSET)
+#define OPTION_MASK_ISA2_AVX10_1_UNSET \
+  (OPTION_MASK_ISA2_AVX10_1_256 | OPTION_MASK_ISA2_AVX10_1_512 \
+   | OPTION_MASK_ISA2_AVX10_2_UNSET)
+#define OPTION_MASK_ISA2_AVX10_2_UNSET \
+  (OPTION_MASK_ISA2_AVX10_2_256 | OPTION_MASK_ISA2_AVX10_2_512 \
+   | OPTION_MASK_ISA2_AMX_AVX512_UNSET)
 #define OPTION_MASK_ISA2_AMX_AVX512_UNSET OPTION_MASK_ISA2_AMX_AVX512
 #define OPTION_MASK_ISA2_AMX_TF32_UNSET OPTION_MASK_ISA2_AMX_TF32
 #define OPTION_MASK_ISA2_AMX_TRANSPOSE_UNSET OPTION_MASK_ISA2_AMX_TRANSPOSE
@@ -1378,8 +1376,8 @@ ix86_handle_option (struct gcc_options *opts,
}
   else
{
- opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_AVX10_1_256_UNSET;
- opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_AVX10_1_256_UNSET;
+ opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_AVX10_1_UNSET;
+ opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_AVX10_1_UNSET;
  opts->x_ix86_no_avx10_1_explicit = 1;
}
   return true;
@@ -1394,8 +1392,8 @@ ix86_handle_option (struct gcc_options *opts,
}
   else
{
- opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_AVX10_1_512_UNSET;
- opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_AVX10_1_512_UNSET;
+ opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_AVX10_1_UNSET;
+ opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_AVX10_1_UNSET;
  opts->x_ix86_no_avx10_1_explicit = 1;
}
   return true;
@@ -1410,8 +1408,8 @@ ix86_handle_option (struct gcc_options *opts,
}
   else
{
- opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_AVX10_2_256_UNSET;
- opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_AVX10_2_256_UNSET;
+ opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_AVX10_2_UNSET;
+ opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_AVX10_2_UNSET;
}
   return

RE: [PATCH v2 1/4] RISC-V: Refactor SAT_* operand rtx extend to reg help func [NFC]

2025-01-26 Thread Li, Pan2
Thanks Jeff, I will resolve the conflict and send v3 after testing.

Pan

-----Original Message-----
From: Jeff Law  
Sent: Monday, January 27, 2025 12:38 AM
To: Li, Pan2 ; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; rdapp@gmail.com
Subject: Re: [PATCH v2 1/4] RISC-V: Refactor SAT_* operand rtx extend to reg help func [NFC]



On 1/23/25 12:01 AM, pan2...@intel.com wrote:
> From: Pan Li 
> 
> This patch would like to refactor the helper function of the SAT_*
> scalar.  The helper function will convert the define_pattern ops
> to the xmode reg for the underlying code-gen.  This patch add
> new parameter for ZERO_EXTEND or SIGN_EXTEND if the input is const_int
> or the mode is non-Xmode.
> 
> The below test suites are passed for this patch.
> * The rv64gcv fully regression test.
> 
> gcc/ChangeLog:
> 
>   * config/riscv/riscv.cc (riscv_gen_zero_extend_rtx): Rename from ...
>   (riscv_extend_to_xmode_reg): Rename to and add rtx_code for
>   zero/sign extend if non-Xmode.
>   (riscv_expand_usadd): Leverage the renamed function with ZERO_EXTEND.
>   (riscv_expand_ussub): Ditto.
Note that I recently made a small change to riscv_gen_zero_extend_rtx 
that I think you need to incorporate into your patch.  Otherwise I think 
you'll get a code quality regression on some of the saturation tests.

Combine purposefully doesn't try to simplify expressions like
(zero_extend (const_int ...)) or (sign_extend (const_int ...)).  It's a 
historical wart, probably related to the lack of a mode on const_int 
objects IIRC.

As a result if you ask the old code for an SImode 0x80000000 you get a
load of 0x80000000 (lui) followed by a zero extend (two shifts).
What you really want is a li to load 0x1, then a left shift to
produce 0x80000000.  This problem had been previously masked by the
mvconst_internal pattern.

The way to get this behavior is to take the incoming constant and mask 
off all the bits outside the desired mode.  Then use gen_int_mode to 
actually generate a canonical const_int.  Then force that into a 
register with force_reg.  ie:

>   /* Combine deliberately does not simplify extensions of constants
>  (long story).  So try to generate the zero extended constant
>  efficiently.
>   
>  First extract the constant and mask off all the bits not in MODE.  */
>   HOST_WIDE_INT val = INTVAL (x);
>   val &= GET_MODE_MASK (mode);
>   
>   /* X may need synthesis, so do not blindly copy it.  */
>   xmode_reg = force_reg (Xmode, gen_int_mode (val, Xmode));
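
The masking step can be tried out on plain integers. The sketch below is only an analogy: mode_mask is a hypothetical stand-in for GET_MODE_MASK taking a bit width instead of a machine mode, and nothing here models gen_int_mode or force_reg.

```cpp
#include <cassert>
#include <cstdint>

// Mask with the low BITS bits set; hypothetical analogue of GET_MODE_MASK.
constexpr std::uint64_t
mode_mask (unsigned bits)
{
  return bits >= 64 ? ~std::uint64_t{0} : (std::uint64_t{1} << bits) - 1;
}

// Zero-extend a constant of width BITS that is stored sign-extended in a
// 64-bit host integer, by masking off every bit outside the mode.
constexpr std::uint64_t
zero_extended_constant (std::int64_t val, unsigned bits)
{
  return static_cast<std::uint64_t> (val) & mode_mask (bits);
}

// A 32-bit constant with only the top bit set is stored sign-extended as
// 0xffffffff80000000; after masking it is the plain value 1 << 31.
static_assert (zero_extended_constant (INT32_MIN, 32)
	       == (std::uint64_t{1} << 31), "");

// Small run-time self-check showing the intended use.
void
example_mask ()
{
  assert (zero_extended_constant (-1, 8) == 0xff);
}
```

Because the extension happens on the constant itself, no separate zero-extend of a register is needed afterwards, which is the code quality point made above.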

I think the upstream ci system hasn't moved the baseline forward in 
about a week.   As a result it's not reporting the failure to apply due 
to the conflict nor is it reporting the code quality regression.



Jeff


[PATCH] libstdc++: Fix localized D_T_FMT %c formatting for [PR117214]

2025-01-26 Thread XU Kailiang
Formatting a time point with %c was implemented by calling
std::vformat_to with a format string constructed from the locale's D_T_FMT
string, but in some locales this string does not comply with
chrono-specs. So just use _M_locale_fmt to avoid this problem.
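
The idea of handing a filled-in struct tm to the locale's own %c handling, instead of re-parsing D_T_FMT as a chrono format string, can be illustrated with the C library. This is an analogy only: it uses std::strftime, not _M_locale_fmt, and make_tm and format_c are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <ctime>
#include <string>

// Build a broken-down time by hand, much as the patch fills in __tm.
// YEAR is the calendar year, MON is 1..12, WDAY is 0 (Sunday)..6.
std::tm
make_tm (int year, int mon, int mday, int wday, int hour, int min, int sec)
{
  std::tm tm{};
  tm.tm_year = year - 1900;
  tm.tm_mon = mon - 1;
  tm.tm_mday = mday;
  tm.tm_wday = wday;
  tm.tm_hour = hour;
  tm.tm_min = min;
  tm.tm_sec = sec;
  return tm;
}

// Let the C library expand %c for the current C locale; the locale's
// date-and-time format is never interpreted by the caller.
std::string
format_c (const std::tm &tm)
{
  char buf[128];
  const std::size_t n = std::strftime (buf, sizeof buf, "%c", &tm);
  return std::string (buf, n);
}

// Small self-check showing the intended use.
void
example_format_c ()
{
  const std::tm tm = make_tm (2025, 1, 26, 0, 12, 34, 56);
  assert (!format_c (tm).empty ());
}
```

Whatever conversions the locale's format contains are handled by the library that defines them, so a D_T_FMT that is not a valid chrono-specs string no longer matters.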

libstdc++-v3/ChangeLog:

PR libstdc++/117214
* include/bits/chrono_io.h (__formatter_chrono::_M_c): use
_M_locale_fmt to format %c time point.
* testsuite/std/time/format/pr117214.cc: New test.

Signed-off-by: XU Kailiang 
---
 libstdc++-v3/include/bits/chrono_io.h | 35 ++-
 .../testsuite/std/time/format/pr117214.cc | 32 +
 2 files changed, 51 insertions(+), 16 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/std/time/format/pr117214.cc

diff --git a/libstdc++-v3/include/bits/chrono_io.h b/libstdc++-v3/include/bits/chrono_io.h
index 6c813bf439d..9a4fa153a98 100644
--- a/libstdc++-v3/include/bits/chrono_io.h
+++ b/libstdc++-v3/include/bits/chrono_io.h
@@ -787,27 +787,30 @@ namespace __format
 
   template
    typename _FormatContext::iterator
-   _M_c(const _Tp& __tt, typename _FormatContext::iterator __out,
+   _M_c(const _Tp& __t, typename _FormatContext::iterator __out,
     _FormatContext& __ctx, bool __mod = false) const
    {
      // %c  Locale's date and time representation.
      // %Ec Locale's alternate date and time representation.
 
-     basic_string<_CharT> __fmt;
-     auto __t = _S_floor_seconds(__tt);
-     locale __loc = _M_locale(__ctx);
-     const auto& __tp = use_facet<__timepunct<_CharT>>(__loc);
-     const _CharT* __formats[2];
-     __tp._M_date_time_formats(__formats);
-     if (*__formats[__mod]) [[likely]]
-       {
-     __fmt = _GLIBCXX_WIDEN("{:L}");
-     __fmt.insert(3u, __formats[__mod]);
-       }
-     else
-       __fmt = _GLIBCXX_WIDEN("{:L%a %b %e %T %Y}");
-     return std::vformat_to(std::move(__out), __loc, __fmt,
-                            std::make_format_args<_FormatContext>(__t));
+     using namespace chrono;
+     auto __d = _S_days(__t);
+     using _TDays = decltype(__d);
+     const auto __ymd = _S_date(__d);
+     const auto __y = __ymd.year();
+     const auto __hms = _S_hms(__t);
+
+     struct tm __tm{};
+     __tm.tm_year = (int)__y - 1900;
+     __tm.tm_yday = (__d - _TDays(__y/January/1)).count();
+     __tm.tm_mon = (unsigned)_S_month(__t) - 1;
+     __tm.tm_mday = (unsigned)_S_day(__t);
+     __tm.tm_wday = _S_weekday(__t).c_encoding();
+     __tm.tm_hour = __hms.hours().count();
+     __tm.tm_min = __hms.minutes().count();
+     __tm.tm_sec = __hms.seconds().count();
+     return _M_locale_fmt(std::move(__out), _M_locale(__ctx), __tm, 'c',
+      __mod ? 'E' : '\0');
    }
 
   template
diff --git a/libstdc++-v3/testsuite/std/time/format/pr117214.cc b/libstdc++-v3/testsuite/std/time/format/pr117214.cc
new file mode 100644
index 000..5b36edadfa0
--- /dev/null
+++ b/libstdc++-v3/testsuite/std/time/format/pr117214.cc
@@ -0,0 +1,32 @@
+// { dg-do run { target c++20 } }
+// { dg-require-namedlocale "aa_DJ.UTF-8" }
+// { dg-require-namedlocale "ar_SA.UTF-8" }
+// { dg-require-namedlocale "ca_AD.UTF-8" }
+// { dg-require-namedlocale "az_IR.UTF-8" }
+// { dg-require-namedlocale "my_MM.UTF-8" }
+
+#include <chrono>
+#include <locale>
+#include <testsuite_hooks.h>
+
+void
+test_c()
+{
+  const char *test_locales[] = {
+    "aa_DJ.UTF-8",
+    "ar_SA.UTF-8",
+    "ca_AD.UTF-8",
+    "az_IR.UTF-8",
+    "my_MM.UTF-8",
+  };
+  for (auto locale_name : test_locales)
+  {
+    std::locale::global(std::locale(locale_name));
+    VERIFY( !std::format("{:L%c}", std::chrono::sys_seconds()).empty() );
+  }
+}
+
+int main()
+{
+  test_c();
+}
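
For readers following the patch, here is a minimal sketch of the general idea of formatting `%c` through the locale's `std::time_put` facet from a filled-in `struct tm`, rather than re-parsing the locale's D_T_FMT string as a chrono format string. This is an illustration using only the public standard library API; `format_c` is a hypothetical helper name, and nothing here is a description of how `_M_locale_fmt` is actually written.

```cpp
#include <ctime>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper (not part of libstdc++): format TM with the
// conversion specifier 'c' using the time_put facet of LOC.
inline std::string format_c(const std::tm& tm, const std::locale& loc)
{
  std::ostringstream os;
  os.imbue(loc);
  const auto& tp = std::use_facet<std::time_put<char>>(loc);
  // put(out, ios, fill, tm, format, modifier); the modifier defaults to 0,
  // and passing 'E' instead would request the alternate representation (%Ec).
  tp.put(std::ostreambuf_iterator<char>(os), os, ' ', &tm, 'c');
  return os.str();
}

// Hypothetical helper: a struct tm describing 1970-01-01 00:00:00 (a Thursday).
inline std::tm epoch_tm()
{
  std::tm tm{};
  tm.tm_year = 70;  // years since 1900
  tm.tm_mon = 0;    // January
  tm.tm_mday = 1;
  tm.tm_wday = 4;   // Thursday
  tm.tm_yday = 0;
  return tm;
}
```

As in the new test above, the only property worth checking portably is that the result is non-empty, since the exact text depends on the locale.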



Re: [PATCH v3 0/4] Hard Register Constraints

2025-01-26 Thread Jeff Law




On 1/23/25 8:49 AM, Stefan Schulze Frielinghaus wrote:

> On Sat, Jan 18, 2025 at 09:36:14AM -0700, Jeff Law wrote:
> [...]
> > > > Do we detect conflicts between a hard register constraint and another
> > > > constraint which requires a singleton class?  That's going to be an error I
> > > > suspect, but curious if it's handled.
> > > 
> > > That is a good point.  Currently I suspect no.  I will have a look.
> > Thanks.  It's not the most important thing on our plate, but given the way
> > x86 is structured we probably need to do something sensible here.
> > 
> > I also worry a bit about non-singleton classes that the target may have
> > added to CLASS_LIKELY_SPILLED_P, though unlike the singleton case, there's
> > at least a chance these will work, albeit potentially generating poor code
> > when an object needs spilling.  I also don't think it's terribly common to
> > add non-singleton classes to that set.
> 
> I was first worried that the single register class construct is somewhat
> special.  To me, it turns out that they behave very similar to my
> current draft.  Basically during LRA in process_alt_operands() I'm
> installing

Yea, I would think they'd largely behave like your proposal. Given the 
presence of a singleton class one could use that to write an ASM just as 
effectively as your hard register proposal.  And satisfying the 
constraints should boil down to the same basic process.


What your proposal does is give users fine grained control as-if the 
port had a singleton class for every register -- without us having to 
add all those pesky register classes.


[ ... ]



> (I have tested those only on x86_64 so far but I expect them to work on
> 32-bit, too, modulo int128)
>
> I will include those, and of course, similar ones for constraints
> b,c,d,S,D in a future patch revision.  If there is any other target with
> non-ordinary register classes/constraints/whatnot just let me know and I
> will have a look.

Thanks for pulling together some tests around that.

armv7 might be interesting to play with, though I suspect it'll just 
work.  It adds LO_REGS to the likely spilled classes when thumb is 
enabled.  LO_REGS isn't a singleton class.  So it may be worth a quick 
test on that target.  A few others do similar things (arm, pru, etc). 
But again, I think it'll just work since it sounds like singletons 
already work.


Jeff


Re: [PATCH] RISC-V: ensure needed FRM restore is not eliminable [PR118646]

2025-01-26 Thread Jeff Law




On 1/24/25 3:12 PM, Vineet Gupta wrote:

> RV-Vector FP-INT insns use the rounding mode in the FRM register which, if
> explicitly set for V insn needs, is saved/restored (although from the
> psABI CC Spec, it is not clear if it is actually caller-saved or
> callee-saved).
>
> Anyhow in the failure case the save/restore were generated by the
> Mode Switch pass, but then eliminated by sched1:DCE and Late-Combine.
> Fix this by using the unspec_volatile variant, which won't be eliminated.
>
> This showed up as SPEC2017 527.cam4 runtime aborts in glibc:round_away()
> which checks for standard rounding modes and the "leaking" rounding mode
> due to the bug happened to be a non-standard RISC-V specific RMM
> "Round to Nearest, ties to Max Magnitude".
>
> This is testsuite clean:

Not sure how it could be clean as I think the test itself is busted ;-)

As-is it'll trigger compile time failures:

FAIL: gfortran.target/riscv/rvv/pr118646.f90   -O0  (test for excess errors)
Excess errors:
/home/jlaw/test/gcc/gcc/testsuite/gfortran.target/riscv/rvv/pr118646.f90:18:12: 
Warning: Deleted feature: End expression in DO loop at (1) must be integer
/home/jlaw/test/gcc/gcc/testsuite/gfortran.target/riscv/rvv/pr118646.f90:22:15: 
Warning: Deleted feature: End expression in DO loop at (1) must be integer
/home/jlaw/test/gcc/gcc/testsuite/gfortran.target/riscv/rvv/pr118646.f90:36:18: 
Warning: Deleted feature: End expression in DO loop at (1) must be integer



A "-w" will work around that, but then there's no Fortran equivalent of 
main and we get a link error:

FAIL: gfortran.target/riscv/rvv/pr118646.f90   -O0  (test for excess errors)
Excess errors:
/release/linux/.build/src/glibc-git-4d29ec7c/csu/../sysdeps/riscv/start.S:67:(.text+0x22):
 undefined reference to `main'

UNRESOLVED: gfortran.target/riscv/rvv/pr118646.f90   -O0  compilation failed to 
produce executable



What I wanted to do was use your testcase with Pan's patch to see if 
Pan's patch resolved both issues.


Your compiler patch may still be desirable as well, I just haven't 
really evaluated that yet.


jeff

ps.  Pre-commit also failed on the new test:

https://github.com/ewlu/gcc-precommit-ci/issues/3051#issuecomment-2613516130








Re: [PATCH v3 0/4] Hard Register Constraints

2025-01-26 Thread Jakub Jelinek
On Sun, Jan 26, 2025 at 08:35:29AM -0700, Jeff Law wrote:
> On 1/23/25 8:49 AM, Stefan Schulze Frielinghaus wrote:
> > On Sat, Jan 18, 2025 at 09:36:14AM -0700, Jeff Law wrote:
> > [...]
> > > > > Do we detect conflicts between a hard register constraint and another
> > > > > constraint which requires a singleton class?  That's going to be an 
> > > > > error I
> > > > > suspect, but curious if it's handled.
> > > > 
> > > > That is a good point.  Currently I suspect no.  I will have a look.
> > > Thanks.  It's not the most important thing on our plate, but given the way
> > > x86 is structured we probably need to do something sensible here.
> > > 
> > > I also worry a bit about non-singleton classes that the target may have
> > > added to CLASS_LIKELY_SPILLED_P, though unlike the singleton case, there's
> > > at least a chance these will work, albeit potentially generating poor code
> > > when an object needs spilling.  I also don't think it's terribly common to
> > > add non-singleton classes to that set.
> > 
> > I was first worried that the single register class construct is somewhat
> > special.  To me, it turns out that they behave very similar to my
> > current draft.  Basically during LRA in process_alt_operands() I'm
> > installing
> Yea, I would think they'd largely behave like your proposal. Given the
> presence of a singleton class one could use that to write an ASM just as
> effectively as your hard register proposal.  And satisfying the constraints
> should boil down to the same basic process.
> 
> What your proposal does is give users fine grained control as-if the port
> had a singleton class for every register -- without us having to add all
> those pesky register classes.

Though, don't we have various hacks for small register classes?
I mean e.g. targetm.class_likely_spilled_p calls in
combine/cse/loop-invariant etc.
If all the registers could be made to behave similarly, don't we need to
create those similarly?

Jakub



[PATCH] c++, v3: Implement for namespace statics CWG 2867 - Order of initialization for structured bindings [PR115769]

2025-01-26 Thread Jakub Jelinek
On Sat, Jan 25, 2025 at 10:53:50AM -0500, Jason Merrill wrote:
> On 1/25/25 4:12 AM, Jakub Jelinek wrote:
> > On Fri, Jan 24, 2025 at 07:07:15PM -0500, Jason Merrill wrote:
> > > Hypothetically, but those cases are just either error or DECL_EXTERNAL. In
> > > the error case we're failing anyway; in the external case all the
> > > base/nonbase for a particular structured binding declaration should be
> > > consistent.
> > 
> > So shall I just remove all the prune_vars_needing_no_initialization hunks
> > then or add gcc_checking_assert (!STATIC_INIT_DECOMP_BASE_P (t) &&
> > !STATIC_INIT_DECOMP_NONBASE_P (t)); for the DECL_EXTERNAL punt case?
> 
> Just the assert sounds good.
> 
> > > > Note, unfortunately it is hard to come up with a testcase that actually
> > > > prunes something on purpose...
> > > 
> > > Indeed, it shouldn't be possible.

So like this?

Passed x86_64-linux and i686-linux bootstrap/regtest.

2025-01-26  Jakub Jelinek  

PR c++/115769
gcc/cp/
* cp-tree.h (STATIC_INIT_DECOMP_BASE_P): Define.
(STATIC_INIT_DECOMP_NONBASE_P): Define.
* decl.cc (cp_finish_decl): Mark nodes in {static,tls}_aggregates
with 
* decl2.cc (decomp_handle_one_var, decomp_finalize_var_list): New
functions.
(emit_partial_init_fini_fn): Use them.
(prune_vars_needing_no_initialization): Assert
STATIC_INIT_DECOMP_*BASE_P is not set on DECL_EXTERNAL vars to be
pruned out.
(partition_vars_for_init_fini): Use same priority for
consecutive STATIC_INIT_DECOMP_*BASE_P vars and propagate
those flags to new TREE_LISTs when possible.  Formatting fix.
(handle_tls_init): Use decomp_handle_one_var and
decomp_finalize_var_list functions.
gcc/testsuite/
* g++.dg/DRs/dr2867-5.C: New test.
* g++.dg/DRs/dr2867-6.C: New test.
* g++.dg/DRs/dr2867-7.C: New test.
* g++.dg/DRs/dr2867-8.C: New test.
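
As background for the ChangeLog above, the following is a small sketch of the kind of declaration this patch concerns: a namespace-scope structured binding whose bindings are initialized dynamically through the tuple-like `get` protocol. The type `Pair` and the function `make_pair_value` are hypothetical names chosen for illustration and do not come from the new dr2867-*.C tests. The example only shows the shape of such a declaration; it does not try to observe the destruction order of temporaries, which is what CWG 2867 actually specifies.

```cpp
#include <cstddef>
#include <tuple>

// Hypothetical tuple-like type used only for this illustration.
struct Pair { int first; int second; };

// get<I> found by argument-dependent lookup for structured bindings.
template <std::size_t I>
int get(const Pair& p)
{
  if constexpr (I == 0)
    return p.first;
  else
    return p.second;
}

namespace std {
  template <> struct tuple_size<Pair> : integral_constant<size_t, 2> {};
  template <size_t I> struct tuple_element<I, Pair> { using type = int; };
}

// Not constexpr, so the initialization below is dynamic.
Pair make_pair_value() { return Pair{1, 2}; }

// Namespace-scope structured binding: a hidden "base" variable is
// initialized from make_pair_value(), then x and y are each initialized
// from get<0> and get<1> applied to that base.
auto [x, y] = make_pair_value();
```

In the terms used by the patch, the hidden variable corresponds to the "base" initializer and `x` and `y` to the "non-base" initializers that use `get`.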

--- gcc/cp/cp-tree.h.jj 2024-09-07 09:31:20.601484156 +0200
+++ gcc/cp/cp-tree.h 2024-09-09 15:53:44.924112247 +0200
@@ -470,6 +470,7 @@ extern GTY(()) tree cp_global_trees[CPTI
   BASELINK_FUNCTIONS_MAYBE_INCOMPLETE_P (in BASELINK)
   BIND_EXPR_VEC_DTOR (in BIND_EXPR)
   ATOMIC_CONSTR_EXPR_FROM_CONCEPT_P (in ATOMIC_CONSTR)
+  STATIC_INIT_DECOMP_BASE_P (in the TREE_LIST for {static,tls}_aggregates)
2: IDENTIFIER_KIND_BIT_2 (in IDENTIFIER_NODE)
   ICS_THIS_FLAG (in _CONV)
   DECL_INITIALIZED_BY_CONSTANT_EXPRESSION_P (in VAR_DECL)
@@ -489,6 +490,8 @@ extern GTY(()) tree cp_global_trees[CPTI
   IMPLICIT_CONV_EXPR_BRACED_INIT (in IMPLICIT_CONV_EXPR)
   PACK_EXPANSION_AUTO_P (in *_PACK_EXPANSION)
   contract_semantic (in ASSERTION_, PRECONDITION_, POSTCONDITION_STMT)
+  STATIC_INIT_DECOMP_NONBASE_P (in the TREE_LIST
+   for {static,tls}_aggregates)
3: IMPLICIT_RVALUE_P (in NON_LVALUE_EXPR or STATIC_CAST_EXPR)
   ICS_BAD_FLAG (in _CONV)
   FN_TRY_BLOCK_P (in TRY_BLOCK)
@@ -5947,6 +5950,21 @@ extern bool defer_mangling_aliases;
 
 extern bool flag_noexcept_type;
 
+/* True if this TREE_LIST in {static,tls}_aggregates is a for dynamic
+   initialization of namespace scope structured binding base or related
+   extended ref init temps.  Temporaries from the initialization of
+   STATIC_INIT_DECOMP_BASE_P dynamic initializers should be destroyed only
+   after the last STATIC_INIT_DECOMP_NONBASE_P dynamic initializer following
+   it.  */
+#define STATIC_INIT_DECOMP_BASE_P(NODE) \
+  TREE_LANG_FLAG_1 (TREE_LIST_CHECK (NODE))
+
+/* True if this TREE_LIST in {static,tls}_aggregates is a for dynamic
+   initialization of namespace scope structured binding non-base
+   variable using get.  */
+#define STATIC_INIT_DECOMP_NONBASE_P(NODE) \
+  TREE_LANG_FLAG_2 (TREE_LIST_CHECK (NODE))
+
 /* A list of namespace-scope objects which have constructors or
destructors which reside in the global scope.  The decl is stored
in the TREE_VALUE slot and the initializer is stored in the
--- gcc/cp/decl.cc.jj   2024-09-09 11:50:07.146394047 +0200
+++ gcc/cp/decl.cc  2024-09-09 17:16:26.459094150 +0200
@@ -8485,6 +8485,7 @@ cp_finish_decl (tree decl, tree init, bo
   bool var_definition_p = false;
   tree auto_node;
   auto_vec extra_cleanups;
+  tree aggregates1 = NULL_TREE;
   struct decomp_cleanup {
 tree decl;
 cp_decomp *&decomp;
@@ -8872,7 +8873,16 @@ cp_finish_decl (tree decl, tree init, bo
}
 
   if (decomp)
-   cp_maybe_mangle_decomp (decl, decomp);
+   {
+ cp_maybe_mangle_decomp (decl, decomp);
+ if (TREE_STATIC (decl) && !DECL_FUNCTION_SCOPE_P (decl))
+   {
+ if (CP_DECL_THREAD_LOCAL_P (decl))
+   aggregates1 = tls_aggregates;
+ else
+   aggregates1 = static_aggregates;
+   }
+   }
 
   /* If this is a local variable that will need a mangled name,
 register it now.  We must do this before pro

Re: [PATCH v2 2/4] RISC-V: Fix incorrect code gen for scalar signed SAT_ADD [PR117688]

2025-01-26 Thread Jeff Law




On 1/23/25 12:01 AM, pan2...@intel.com wrote:

> From: Pan Li 
>
> This patch would like to fix the wrong code generation for the scalar
> signed SAT_ADD.  The input can be QI/HI/SI/DI while the ALU like sub
> can only work on Xmode.  Unfortunately we don't have sub/add for
> non-Xmode like QImode in scalar, thus we need to sign extend to Xmode
> to ensure we have the correct value before ALU like add.  The gen_lowpart
> will generate something like lbu which has all zero for highest bits.
>
> For example, when 0xff (-1 for QImode) plus 0x2 (2 for QImode), we actually
> want -1 + 2 = 1, but without sign extension (as with lbu), we will get
> 0xff + 2 = 0x101 which is incorrect.  Thus, we have to sign extend 0xff (QImode)
> to 0x(assume Xmode is DImode) before plus in Xmode.
>
> The below test suites are passed for this patch.
> * The rv64gcv full regression test.
>
> PR target/117688
>
> gcc/ChangeLog:
>
> * config/riscv/riscv.cc (riscv_expand_ssadd): Leverage the helper
> riscv_extend_to_xmode_reg with SIGN_EXTEND.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/pr117688-add-run-1-s16.c: New test.
> * gcc.target/riscv/pr117688-add-run-1-s32.c: New test.
> * gcc.target/riscv/pr117688-add-run-1-s64.c: New test.
> * gcc.target/riscv/pr117688-add-run-1-s8.c: New test.
> * gcc.target/riscv/pr117688.h: New test.

Conceptually OK.  We just need to get the helper fixed up properly, then 
retest before committing.
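
The 0xff + 2 example from the quoted description can be reproduced in plain C++ as a sanity check. This is only an arithmetic sketch of zero extension versus sign extension of an 8-bit value into a 64-bit register; the function names are hypothetical and it does not model riscv_expand_ssadd or any RTL.

```cpp
#include <cstdint>

// Both operands zero-extended from 8 bits (what an lbu-style load gives).
constexpr std::int64_t add_zero_extended(std::uint8_t a, std::uint8_t b)
{
  return std::int64_t{a} + std::int64_t{b};
}

// Both operands sign-extended from 8 bits before the 64-bit add.
constexpr std::int64_t add_sign_extended(std::uint8_t a, std::uint8_t b)
{
  return std::int64_t{static_cast<std::int8_t>(a)}
         + std::int64_t{static_cast<std::int8_t>(b)};
}

// 0xff is -1 as a signed 8-bit value.
static_assert(add_zero_extended(0xff, 2) == 0x101);  // wrong for signed math
static_assert(add_sign_extended(0xff, 2) == 1);      // -1 + 2
```

The same reasoning applies to the subtraction case, where the operands must also be sign-extended before the full-width operation.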


jeff



Re: [PATCH v2 1/4] RISC-V: Refactor SAT_* operand rtx extend to reg help func [NFC]

2025-01-26 Thread Jeff Law




On 1/23/25 12:01 AM, pan2...@intel.com wrote:

> From: Pan Li 
>
> This patch would like to refactor the helper function of the SAT_*
> scalar.  The helper function will convert the define_pattern ops
> to the Xmode reg for the underlying code-gen.  This patch adds a
> new parameter for ZERO_EXTEND or SIGN_EXTEND if the input is const_int
> or the mode is non-Xmode.
>
> The below test suites are passed for this patch.
> * The rv64gcv full regression test.
>
> gcc/ChangeLog:
>
> * config/riscv/riscv.cc (riscv_gen_zero_extend_rtx): Rename from ...
> (riscv_extend_to_xmode_reg): Rename to and add rtx_code for
> zero/sign extend if non-Xmode.
> (riscv_expand_usadd): Leverage the renamed function with ZERO_EXTEND.
> (riscv_expand_ussub): Ditto.

Note that I recently made a small change to riscv_gen_zero_extend_rtx 
that I think you need to incorporate into your patch.  Otherwise I think 
you'll get a code quality regression on some of the saturation tests.


Combine purposefully doesn't try to simplify expressions like
(zero_extend (const_int ...)) or (sign_extend (const_int ...)).  It's a 
historical wart, probably related to the lack of a mode on const_int 
objects IIRC.


As a result, if you ask the old code for an SImode 0x8000 you get a 
load of 0x8000 (lui) followed by a zero extend (two shifts). 
What you really want is a li to load 0x1, then a left shift to 
produce 0x8000.  This problem had been previously masked by the 
mvconst_internal pattern.


The way to get this behavior is to take the incoming constant and mask 
off all the bits outside the desired mode.  Then use gen_int_mode to 
actually generate a canonical const_int.  Then force that into a 
register with force_reg.  ie:



  /* Combine deliberately does not simplify extensions of constants
 (long story).  So try to generate the zero extended constant
 efficiently.
  
 First extract the constant and mask off all the bits not in MODE.  */

  HOST_WIDE_INT val = INTVAL (x);
  val &= GET_MODE_MASK (mode);
  
  /* X may need synthesis, so do not blindly copy it.  */

  xmode_reg = force_reg (Xmode, gen_int_mode (val, Xmode));
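
To make the masking step concrete outside of GCC, here is a small standalone sketch of "mask off all the bits not in MODE" for a constant. It is only an arithmetic model: `mode_mask` is a hypothetical stand-in for GET_MODE_MASK taking a bit width, and `zero_extend_const` is a hypothetical name; neither gen_int_mode nor force_reg is modelled.

```cpp
#include <cstdint>

// Hypothetical stand-in for GET_MODE_MASK: all-ones mask for a mode that
// is BITS wide (1 <= BITS <= 64).
constexpr std::uint64_t mode_mask(unsigned bits)
{
  return bits >= 64 ? ~std::uint64_t{0} : (std::uint64_t{1} << bits) - 1;
}

// Zero-extend the constant VAL from a BITS-wide mode to 64 bits by
// clearing every bit outside the narrow mode.
constexpr std::uint64_t zero_extend_const(std::int64_t val, unsigned bits)
{
  return static_cast<std::uint64_t>(val) & mode_mask(bits);
}

static_assert(zero_extend_const(-1, 8) == 0xff);
static_assert(zero_extend_const(-1, 32) == 0xffffffffu);
static_assert(zero_extend_const(-0x80000000LL, 32) == 0x80000000u);
static_assert(zero_extend_const(0x1234, 64) == 0x1234);
```

Computing the extended value up front means the constant can be materialized directly, instead of loading the narrow value and extending it with extra instructions.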


I think the upstream ci system hasn't moved the baseline forward in 
about a week.   As a result it's not reporting the failure to apply due 
to the conflict nor is it reporting the code quality regression.




Jeff


Re: [PATCH v2 3/4] RISC-V: Fix incorrect code gen for scalar signed SAT_SUB [PR117688]

2025-01-26 Thread Jeff Law




On 1/23/25 12:01 AM, pan2...@intel.com wrote:

> From: Pan Li 
>
> This patch would like to fix the wrong code generation for the scalar
> signed SAT_SUB.  The input can be QI/HI/SI/DI while the ALU like sub
> can only work on Xmode.  Unfortunately we don't have sub/add for
> non-Xmode like QImode in scalar, thus we need to sign extend to Xmode
> to ensure we have the correct value before ALU like sub.  The gen_lowpart
> will generate something like lbu which has all zero for highest bits.
>
> For example, when 0xff (-1 for QImode) sub 0x1 (1 for QImode), we actually
> want -1 - 1 = -2, but without sign extension (as with lbu), we will get
> 0xff - 1 = 0xfe which is incorrect.  Thus, we have to sign extend 0xff (QImode)
> to 0x(assume Xmode is DImode) before sub in Xmode.
>
> The below test suites are passed for this patch.
> * The rv64gcv full regression test.
>
> PR target/117688
>
> gcc/ChangeLog:
>
> * config/riscv/riscv.cc (riscv_expand_sssub): Leverage the helper
> riscv_extend_to_xmode_reg with SIGN_EXTEND.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/pr117688.h: Add test helper macro.
> * gcc.target/riscv/pr117688-sub-run-1-s16.c: New test.
> * gcc.target/riscv/pr117688-sub-run-1-s32.c: New test.
> * gcc.target/riscv/pr117688-sub-run-1-s64.c: New test.
> * gcc.target/riscv/pr117688-sub-run-1-s8.c: New test.

Again, conceptually OK.  We'll just need to make sure to retest after 
you adjust the helper.  Similarly for patch #4 in this series.



jeff



[committed, obvious] OpenMP: Fix typo in atomic directive error message

2025-01-26 Thread Sandra Loosemore
gcc/fortran/ChangeLog
* openmp.cc (resolve_omp_atomic): Fix typo in error message.

gcc/testsuite/ChangeLog
* gfortran.dg/gomp/atomic-26.f90: Correct expected output after
fixing typo in error message.
---
 gcc/fortran/openmp.cc| 2 +-
 gcc/testsuite/gfortran.dg/gomp/atomic-26.f90 | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc
index be78aa1ab27..7875341b2cf 100644
--- a/gcc/fortran/openmp.cc
+++ b/gcc/fortran/openmp.cc
@@ -10410,7 +10410,7 @@ resolve_omp_atomic (gfc_code *code)
   gfc_intrinsic_op alt_op = INTRINSIC_NONE;
 
   if (atomic_code->ext.omp_clauses->fail != OMP_MEMORDER_UNSET)
-   gfc_error ("!$OMP ATOMIC UPDATE at %L with FAIL clause requiries either"
+   gfc_error ("!$OMP ATOMIC UPDATE at %L with FAIL clause requires either"
   " the COMPARE clause or using the intrinsic MIN/MAX "
   "procedure", &atomic_code->loc);
   switch (op)
diff --git a/gcc/testsuite/gfortran.dg/gomp/atomic-26.f90 b/gcc/testsuite/gfortran.dg/gomp/atomic-26.f90
index 6448bd9b8bb..3d88cd72d8d 100644
--- a/gcc/testsuite/gfortran.dg/gomp/atomic-26.f90
+++ b/gcc/testsuite/gfortran.dg/gomp/atomic-26.f90
@@ -38,11 +38,11 @@ real function bar (y, e, f)
   v = d
   !$omp atomic fail(relaxed), write! { dg-error "FAIL clause is incompatible with READ or WRITE" }
   d = v
-  !$omp atomic fail(relaxed) update! { dg-error "FAIL clause requiries either the COMPARE clause or using the intrinsic MIN/MAX procedure" }
+  !$omp atomic fail(relaxed) update! { dg-error "FAIL clause requires either the COMPARE clause or using the intrinsic MIN/MAX procedure" }
   d = d + 3.0
-  !$omp atomic fail(relaxed)   ! { dg-error "FAIL clause requiries either the COMPARE clause or using the intrinsic MIN/MAX procedure" }
+  !$omp atomic fail(relaxed)   ! { dg-error "FAIL clause requires either the COMPARE clause or using the intrinsic MIN/MAX procedure" }
   d = d + 3.0
-  !$omp atomic capture fail(relaxed)   ! { dg-error "FAIL clause requiries either the COMPARE clause or using the intrinsic MIN/MAX procedure" }
+  !$omp atomic capture fail(relaxed)   ! { dg-error "FAIL clause requires either the COMPARE clause or using the intrinsic MIN/MAX procedure" }
   v = d; d = d + 3.0
   !$omp atomic read weak   ! { dg-error "WEAK clause requires COMPARE clause" }
   v = d
-- 
2.34.1