[committed] Fortran: In openmp.cc, uncomment unroll/tile lines of gfc_omp_directives
This was seemingly forgotten when UNROLL/TILE was added.

Committed as r15-7220-g7cd133a6e4b042 as obvious.

Tobias

commit 7cd133a6e4b04262620489dbf4b4e3ae5e96c95f
Author: Tobias Burnus
Date:   Mon Jan 27 00:35:17 2025 +0100

    Fortran: In openmp.cc, uncomment unroll/tile lines of gfc_omp_directives

    Enable unroll and tile for assume's contains/absent clauses as both
    directives are implemented since r15-1037-g804c0f35a6b1d7.

    gcc/fortran/ChangeLog:

            * openmp.cc (gfc_omp_directives): Uncomment unroll and tile lines
            as the directives are by now implemented.
---
 gcc/fortran/openmp.cc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc
index 7875341b2cf..35661d88f1e 100644
--- a/gcc/fortran/openmp.cc
+++ b/gcc/fortran/openmp.cc
@@ -109,8 +109,8 @@ static const struct gfc_omp_directive gfc_omp_directives[] = {
   {"task", GFC_OMP_DIR_EXECUTABLE, ST_OMP_TASK},
   {"teams", GFC_OMP_DIR_EXECUTABLE, ST_OMP_TEAMS},
   {"threadprivate", GFC_OMP_DIR_DECLARATIVE, ST_OMP_THREADPRIVATE},
-  /* {"tile", GFC_OMP_DIR_EXECUTABLE, ST_OMP_TILE}, */
-  /* {"unroll", GFC_OMP_DIR_EXECUTABLE, ST_OMP_UNROLL}, */
+  {"tile", GFC_OMP_DIR_EXECUTABLE, ST_OMP_TILE},
+  {"unroll", GFC_OMP_DIR_EXECUTABLE, ST_OMP_UNROLL},
   {"workshare", GFC_OMP_DIR_EXECUTABLE, ST_OMP_WORKSHARE},
 };
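
For readers unfamiliar with the table: a minimal sketch of why a commented-out entry matters, assuming the table is consulted by a plain name lookup. The types and the find_directive helper are hypothetical stand-ins, not the gfortran data structures; only the directive names come from the hunk above.

```cpp
#include <cassert>
#include <cstring>

// Simplified stand-ins for the GCC types; these names are hypothetical.
enum class DirKind { Declarative, Executable };

struct Directive {
  const char *name;
  DirKind kind;
};

// Same entries and order as the tail of the table shown in the patch.
static const Directive directives[] = {
  {"task", DirKind::Executable},
  {"teams", DirKind::Executable},
  {"threadprivate", DirKind::Declarative},
  {"tile", DirKind::Executable},
  {"unroll", DirKind::Executable},
  {"workshare", DirKind::Executable},
};

// Return the table entry for NAME, or nullptr if the name is not listed.
const Directive *find_directive(const char *name) {
  for (const Directive &d : directives)
    if (std::strcmp(d.name, name) == 0)
      return &d;
  return nullptr;
}
```

With the two entries commented out, a lookup of "tile" or "unroll" in this sketch would return nullptr, which is the kind of failure the commit removes for the contains/absent clauses.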
Re: [PATCH v6 4/6] OpenMP: Fortran support for metadirectives and dynamic selectors
Hi Sandra, this patch LGTM with some minor comments. Or rather: I have a few minor comments that should be fixed right away and a few larger items for which PRs should be filed. See below. Sandra Loosemore wrote: gcc/fortran/ChangeLog PR middle-end/112779 PR middle-end/113904 * decl.cc (gfc_match_end): Handle COMP_OMP_BEGIN_METADIRECTIVE and COMP_OMP_METADIRECTIVE. * dump-parse-tree.cc (show_omp_node): Handle EXEC_OMP_METADIRECTIVE. (show_code_node): Likewise. * gfortran.h (enum gfc_statement): Add ST_OMP_METADIRECTIVE, ST_OMP_BEGIN_METADIRECTIVE, and ST_OMP_END_METADIRECTIVE. (struct gfc_omp_clauses): Rename target_first_st_is_teams to target_first_st_is_teams_or_meta. (struct gfc_omp_variant): New. (gfc_get_omp_variant): New. (struct gfc_st_label): Add omp_region field. (enum gfc_exec_op): Add EXEC_OMP_METADIRECTIVE. (struct gfc_code): Add omp_variants fields. (gfc_free_omp_variants): Declare. (match_omp_directive): Declare. (is_omp_declarative_stmt): Declare. * io.cc (format_asterisk): Adjust initializer. * match.h (gfc_match_omp_begin_metadirective): Declare. (gfc_match_omp_metadirective): Declare. * openmp.cc (gfc_match_omp_eos): Adjust to match context selectors. (gfc_free_omp_variants): New. (gfc_match_omp_clauses): Remove context_selector parameter and adjust to use gfc_match_omp_eos instead. (match_omp): Adjust call to gfc_match_omp_clauses. (gfc_match_omp_context_selector): Add metadirective_p parameter and adjust error-checking. Adjust matching of simd clauses. (gfc_match_omp_context_selector_specification): Adjust parameters so it can be used for metadirective as well as declare variant. (match_omp_metadirective): New. (gfc_match_omp_begin_metadirective): New. (gfc_match_omp_metadirective): New. (resolve_omp_metadirective): New. (resolve_omp_target): Handle metadirectives. (gfc_resolve_omp_directive): Handle EXEC_OMP_METADIRECTIVE. * parse.cc (gfc_matching_omp_context_selector): New. (gfc_in_omp_metadirective_body): New. (gfc_omp_region_count): New. 
(decode_omp_directive): Handle ST_OMP_BEGIN_METADIRECTIVE and ST_OMP_METADIRECTIVE. (match_omp_directive): New. (case_omp_structured_block): Define. (case_omp_do): Define. (gfc_ascii_statement): Handle ST_OMP_BEGIN_METADIRECTIVE, ST_OMP_END_METADIRECTIVE, and ST_OMP_METADIRECTIVE. (accept_statement): Handle ST_OMP_METADIRECTIVE and ST_OMP_BEGIN_METADIRECTIVE. (gfc_omp_end_stmt): New, split from... (parse_omp_do): ...here, and... (parse_omp_structured_block): ...here. Handle metadirectives. (parse_omp_metadirective_body): New. (parse_executable): Handle metadirective. Use new case macros defined above. (gfc_parse_file): Initialize metadirective state. (is_omp_declarative_stmt): New. * parse.h (enum gfc_compile_state): Add COMP_OMP_METADIRECTIVE and COMP_OMP_BEGIN_METADIRECTIVE. (gfc_omp_end_stmt): Declare. (gfc_matching_omp_context_selector): Declare. (gfc_in_omp_metadirective_body): Declare. (gfc_omp_metadirective_region_count): Declare. * resolve.cc (gfc_resolve_code): Handle EXEC_OMP_METADIRECTIVE. * st.cc (gfc_free_statement): Likewise. * symbol.cc (compare_st_labels): Handle labels within a metadirective body. (gfc_get_st_label): Likewise. * trans-decl.cc (gfc_get_label_decl): Encode the metadirective region in the label_name. * trans-openmp.cc (gfc_trans_omp_directive): Handle EXEC_OMP_METADIRECTIVE. (gfc_trans_omp_set_selector): New, split/adapted from code (gfc_trans_omp_declare_variant): ...here. (gfc_trans_omp_metadirective): New. * trans-stmt.h (gfc_trans_omp_metadirective): Declare. * trans.cc (trans_code): Handle EXEC_OMP_METADIRECTIVE. gcc/testsuite/ChangeLog PR middle-end/112779 PR middle-end/113904 * gfortran.dg/gomp/metadirective-1.f90: New. * gfortran.dg/gomp/metadirective-10.f90: New. * gfortran.dg/gomp/metadirective-11.f90: New. * gfortran.dg/gomp/metadirective-12.f90: New. * gfortran.dg/gomp/metadirective-2.f90: New. * gfortran.dg/gomp/metadirective-3.f90: New. * gfortran.dg/gomp/metadirective-4.f90: New. 
* gfortran.dg/gomp/metadirective-5.f90: New. * gfortran.dg/gomp/metadirective-6.f90: New. * gfortran.dg/gomp/metadirective-7.f90: New. * gfortran.dg/gomp/metadirective-8.f90: New. * gfortran.dg/gomp/metadirective-9.f90: New. * gfortran.dg/gomp/metadirective-construct.f90: New. * gfortran.dg
Re: [PATCH v2 7/7] Alpha: Add option to avoid data races for partial writes [PR117759]
"Maciej W. Rozycki" writes: ... > There are notable regressions between a plain `-mno-bwx' configuration > and a `-mno-bwx -msafe-partial' one: > > FAIL: gm2/iso/run/pass/strcons.mod execution, -g > FAIL: gm2/iso/run/pass/strcons.mod execution, -O > FAIL: gm2/iso/run/pass/strcons.mod execution, -O -g > FAIL: gm2/iso/run/pass/strcons.mod execution, -Os > FAIL: gm2/iso/run/pass/strcons.mod execution, -O3 -fomit-frame-pointer > FAIL: gm2/iso/run/pass/strcons.mod execution, -O3 -fomit-frame-pointer > -finline-functions > FAIL: gm2/iso/run/pass/strcons4.mod execution, -g > FAIL: gm2/iso/run/pass/strcons4.mod execution, -O > FAIL: gm2/iso/run/pass/strcons4.mod execution, -O -g > FAIL: gm2/iso/run/pass/strcons4.mod execution, -Os > FAIL: gm2/iso/run/pass/strcons4.mod execution, -O3 -fomit-frame-pointer > FAIL: gm2/iso/run/pass/strcons4.mod execution, -O3 -fomit-frame-pointer > -finline-functions > > Just as with `-msafe-bwa' regressions they come from the fact that these > test cases end up calling code that expects a reference to aligned data > but is handed one to unaligned data, causing an alignment exception with > LDL_L or LDQ_L, which will eventually be fixed up by Linux. > > In some cases GCC chooses to open-code block memory write operations, so > with non-BWX targets `-msafe-partial' will in the usual case have to be > used together with `-msafe-bwa'. > I've logged PR 118600 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118600 and have an experimental proposed patch and changelog below. In summary the patch tests every assignment (of a constructor to a designator) to ensure the types are GCC equivalent. If they are equivalent then it uses assignment and if not then it copies a structure by field and uses strncpy to copy a string cst into an array. I wonder if these changes fix the regression test failures seen on Alpha above? 
regards, Gaius -- PR modula2/118600 Assigning to a record causes alignment exception This patch recursively tests every assignment (of a constructor to a designator) to ensure the types are GCC equivalent. If they are equivalent then it uses gimple assignment and if not then it copies a structure by field and uses __builtin_strncpy to copy a string cst into an array. Unions are copied by __builtin_memcpy. gcc/m2/ChangeLog: * gm2-compiler/M2GenGCC.mod (PerformCodeBecomes): New procedure. (CodeBecomes): Refactor and call PerformCodeBecomes. * gm2-gcc/m2builtins.cc (gm2_strncpy_node): New global variable. (DoBuiltinStrNCopy): New function. (m2builtins_BuiltinStrNCopy): New function. (m2builtins_init): Initialize gm2_strncpy_node. * gm2-gcc/m2builtins.def (BuiltinStrNCopy): New procedure function. * gm2-gcc/m2builtins.h (m2builtins_BuiltinStrNCopy): New function. * gm2-gcc/m2statement.cc (copy_record_fields): New function. (copy_array): Ditto. (copy_strncpy): Ditto. (copy_memcpy): Ditto. (CopyByField_Lower): Ditto. (m2statement_CopyByField): Ditto. * gm2-gcc/m2statement.def (CopyByField): New procedure function. * gm2-gcc/m2statement.h (m2statement_CopyByField): New function. * gm2-gcc/m2type.cc (check_record_fields): Ditto. (check_array_types): Ditto. (m2type_IsGccStrictTypeEquivalent): Ditto. * gm2-gcc/m2type.def (IsGccStrictTypeEquivalent): New procedure function. * gm2-gcc/m2type.h (m2type_IsAddress): Replace return type int with bool. 
diff --git a/gcc/m2/gm2-compiler/M2GenGCC.mod b/gcc/m2/gm2-compiler/M2GenGCC.mod index bba77ff12e1..912dfe7b8e8 100644 --- a/gcc/m2/gm2-compiler/M2GenGCC.mod +++ b/gcc/m2/gm2-compiler/M2GenGCC.mod @@ -43,7 +43,7 @@ FROM SymbolTable IMPORT PushSize, PopSize, PushValue, PopValue, IsConst, IsConstSet, IsProcedure, IsProcType, IsVar, IsVarParamAny, IsTemporary, IsTuple, IsEnumeration, -IsUnbounded, IsArray, IsSet, IsConstructor, +IsUnbounded, IsArray, IsSet, IsConstructor, IsConstructorConstant, IsProcedureVariable, IsUnboundedParamAny, IsRecordField, IsFieldVarient, IsVarient, IsRecord, @@ -231,7 +231,7 @@ FROM m2statement IMPORT BuildAsm, BuildProcedureCallTree, BuildParam, BuildFunct BuildReturnValueCode, SetLastFunction, BuildIncludeVarConst, BuildIncludeVarVar, BuildExcludeVarConst, BuildExcludeVarVar, -BuildBuiltinCallTree, +BuildBuiltinCallTree, CopyByField, GetParamTree, BuildCleanUp, BuildTryFinally, GetLastFunction, SetLastFunction, @@ -240,7 +240,7 @@ FROM m2statement IMPORT BuildAsm, BuildProcedureCallTree, BuildParam, Bu
[PATCH] Fortran: fix bogus diagnostics on renamed interface import [PR110993]
Dear all, in the checking of imported interfaces we need to use the local names of procedures that are renamed-on-use, as the original name becomes inaccessible. Similarly, we should not compare interfaces of non-bind(C) procedures against bind(C) interfaces that are not explicitly made accessible via a use statement, see testcase. Regtested on x86_64-pc-linux-gnu. OK for mainline? Could this one be backportable, e.g. to 14-branch? Thanks, Harald From fb19a4bd29f49935514a7c2a43dbc9f2a6e9b147 Mon Sep 17 00:00:00 2001 From: Harald Anlauf Date: Sun, 26 Jan 2025 22:56:57 +0100 Subject: [PATCH] Fortran: fix bogus diagnostics on renamed interface import [PR110993] PR fortran/110993 gcc/fortran/ChangeLog: * frontend-passes.cc (check_externals_procedure): Do not compare interfaces of a non-bind(C) procedure against a bind(C) global one. (check_against_globals): Use local name from rename-on-use in the search for interfaces. gcc/testsuite/ChangeLog: * gfortran.dg/use_rename_14.f90: New test. --- gcc/fortran/frontend-passes.cc | 7 gcc/testsuite/gfortran.dg/use_rename_14.f90 | 46 + 2 files changed, 53 insertions(+) create mode 100644 gcc/testsuite/gfortran.dg/use_rename_14.f90 diff --git a/gcc/fortran/frontend-passes.cc b/gcc/fortran/frontend-passes.cc index 987238794da..6b470b83e21 100644 --- a/gcc/fortran/frontend-passes.cc +++ b/gcc/fortran/frontend-passes.cc @@ -5704,6 +5704,9 @@ check_externals_procedure (gfc_symbol *sym, locus *loc, if (gsym->ns) gfc_find_symbol (sym->name, gsym->ns, 0, &def_sym); + if (gsym->bind_c && def_sym && def_sym->binding_label == NULL) +return 0; + if (def_sym) { gfc_compare_actual_formal (&actual, def_sym->formal, 0, 0, 0, loc); @@ -5800,6 +5803,10 @@ check_against_globals (gfc_symbol *sym) if (sym->binding_label) sym_name = sym->binding_label; + else if (sym->attr.use_rename + && sym->ns->use_stmts->rename + && sym->ns->use_stmts->rename->local_name[0] != '\0') +sym_name = sym->ns->use_stmts->rename->local_name; else sym_name = sym->name; 
diff --git a/gcc/testsuite/gfortran.dg/use_rename_14.f90 b/gcc/testsuite/gfortran.dg/use_rename_14.f90 new file mode 100644 index 000..03815a5f229 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/use_rename_14.f90 @@ -0,0 +1,46 @@ +! { dg-do compile } +! +! PR fortran/110993 - bogus diagnostics on renamed interface import +! +! Contributed by Rimvydas Jasinskas + +module m + interface +subroutine bar(x) + use iso_c_binding, only : c_float + implicit none + real(c_float) :: x(45) +end subroutine + end interface +end + +module m1 + interface +subroutine bar1(x) bind(c) + use iso_c_binding, only : c_float + implicit none + real(c_float) :: x(45) +end subroutine + end interface +end + +module m2 + interface +subroutine bar2(x) bind(c, name="bar2_") + use iso_c_binding, only : c_float + implicit none + real(c_float) :: x(45) +end subroutine + end interface +end + +subroutine foo(y) + use m, notthisone => bar + use m1, northisone => bar1 + use m2, orthisone => bar2 + implicit none + real :: y(3) + call bar (y) + call bar1(y) + call bar2(y) +end subroutine -- 2.43.0
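
A rough sketch of the name selection in the check_against_globals hunk, for illustration only. The Sym struct is a hypothetical, flattened stand-in: the real code reads the local name through sym->ns->use_stmts->rename, which is collapsed into a single field here.

```cpp
#include <cassert>
#include <string>

// Hypothetical, flattened stand-in for the symbol fields the hunk consults.
struct Sym {
  std::string name;           // original procedure name
  std::string binding_label;  // empty if there is no bind(C) label
  bool use_rename = false;    // symbol was renamed on use
  std::string local_name;     // local name from the rename, may be empty
};

// Pick the name used to search the global symbols, in the same priority
// order as the patch: binding label, then rename-on-use local name, then
// the plain symbol name.
std::string global_lookup_name(const Sym &s) {
  if (!s.binding_label.empty())
    return s.binding_label;
  if (s.use_rename && !s.local_name.empty())
    return s.local_name;
  return s.name;
}
```

In the testcase, "use m, notthisone => bar" would make the sketch search under "notthisone", so the unrelated external "bar" called in foo is no longer compared against the interface from module m.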
Re: [PATCH] Fortran: fix bogus diagnostics on renamed interface import [PR110993]
On 1/26/25 2:07 PM, Harald Anlauf wrote: Dear all, in the checking of imported interfaces we need to use the local names of procedures that are renamed-on-use, as the original name becomes inaccessible. Similarly, we should not compare interfaces of non-bind(C) procedures against bind(C) interfaces that are not explicitly made accessible via a use statement, see testcase. Regtested on x86_64-pc-linux-gnu. OK for mainline? Could this one be backportable, e.g. to 14-branch? Thanks, Harald This is OK. Backport up to you. Jerry
[PATCH v1] RISC-V: Remove unnecessary frm restore volatile define_insn
From: Pan Li After we add the frm register to the global_regs, we may not need to define_insn that volatile to emit the frm restore insns. The cooperatively-managed global register will help to handle this, instead of emit the volatile define_insn explicitly. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_emit_frm_mode_set): Refactor the frm mode set by removing fsrmsi_restore_volatile. * config/riscv/vector-iterators.md (unspecv): Remove as unnecessary. * config/riscv/vector.md (fsrmsi_restore_volatile): Ditto. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/float-point-dynamic-frm-49.c: Adjust the asm dump check times. * gcc.target/riscv/rvv/base/float-point-dynamic-frm-50.c: Ditto. * gcc.target/riscv/rvv/base/float-point-dynamic-frm-52.c: Ditto. * gcc.target/riscv/rvv/base/float-point-dynamic-frm-74.c: Ditto. * gcc.target/riscv/rvv/base/float-point-dynamic-frm-75.c: Ditto. Signed-off-by: Pan Li --- gcc/config/riscv/riscv.cc | 43 ++- gcc/config/riscv/vector-iterators.md | 4 -- gcc/config/riscv/vector.md| 13 -- .../rvv/base/float-point-dynamic-frm-49.c | 2 +- .../rvv/base/float-point-dynamic-frm-50.c | 2 +- .../rvv/base/float-point-dynamic-frm-52.c | 2 +- .../rvv/base/float-point-dynamic-frm-74.c | 2 +- .../rvv/base/float-point-dynamic-frm-75.c | 2 +- 8 files changed, 28 insertions(+), 42 deletions(-) diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index dd50fe4eddf..8e3bf0077cd 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -12031,27 +12031,30 @@ riscv_emit_frm_mode_set (int mode, int prev_mode) if (prev_mode == riscv_vector::FRM_DYN_CALL) emit_insn (gen_frrmsi (backup_reg)); /* Backup frm when DYN_CALL. */ - if (mode != prev_mode) -{ - rtx frm = gen_int_mode (mode, SImode); - - if (mode == riscv_vector::FRM_DYN_CALL - && prev_mode != riscv_vector::FRM_DYN && STATIC_FRM_P (cfun)) - /* No need to emit when prev mode is DYN already. 
*/ - emit_insn (gen_fsrmsi_restore_volatile (backup_reg)); - else if (mode == riscv_vector::FRM_DYN_EXIT && STATIC_FRM_P (cfun) - && prev_mode != riscv_vector::FRM_DYN - && prev_mode != riscv_vector::FRM_DYN_CALL) - /* No need to emit when prev mode is DYN or DYN_CALL already. */ - emit_insn (gen_fsrmsi_restore_volatile (backup_reg)); - else if (mode == riscv_vector::FRM_DYN - && prev_mode != riscv_vector::FRM_DYN_CALL) - /* Restore frm value from backup when switch to DYN mode. */ - emit_insn (gen_fsrmsi_restore (backup_reg)); - else if (riscv_static_frm_mode_p (mode)) - /* Set frm value when switch to static mode. */ - emit_insn (gen_fsrmsi_restore (frm)); + if (mode == prev_mode) +return; + + if (riscv_static_frm_mode_p (mode)) +{ + /* Set frm value when switch to static mode. */ + emit_insn (gen_fsrmsi_restore (gen_int_mode (mode, SImode))); + return; } + + bool restore_p += /* No need to emit when prev mode is DYN. */ + (STATIC_FRM_P (cfun) && mode == riscv_vector::FRM_DYN_CALL + && prev_mode != riscv_vector::FRM_DYN) + /* No need to emit if prev mode is DYN or DYN_CALL. */ + || (STATIC_FRM_P (cfun) && mode == riscv_vector::FRM_DYN_EXIT + && prev_mode != riscv_vector::FRM_DYN + && prev_mode != riscv_vector::FRM_DYN_CALL) + /* Restore frm value when switch to DYN mode. */ + || (mode == riscv_vector::FRM_DYN + && prev_mode != riscv_vector::FRM_DYN_CALL); + + if (restore_p) +emit_insn (gen_fsrmsi_restore (backup_reg)); } /* Implement Mode switching. 
*/ diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md index c1bd7397441..f64e7ad70dd 100644 --- a/gcc/config/riscv/vector-iterators.md +++ b/gcc/config/riscv/vector-iterators.md @@ -122,10 +122,6 @@ (define_c_enum "unspec" [ UNSPEC_SF_VFNRCLIPU ]) -(define_c_enum "unspecv" [ - UNSPECV_FRM_RESTORE_EXIT -]) - ;; Subset of VI with fractional LMUL types (define_mode_iterator VI_FRAC [ RVVMF2QI RVVMF4QI (RVVMF8QI "TARGET_MIN_VLEN > 32") diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md index cf22b39d6cb..fe10eabeb2e 100644 --- a/gcc/config/riscv/vector.md +++ b/gcc/config/riscv/vector.md @@ -1116,19 +1116,6 @@ (define_insn "fsrmsi_restore" (set_attr "mode" "SI")] ) -;; The volatile fsrmsi restore is used for the exit point for the -;; dynamic mode switching. It will generate one volatile fsrm a5 -;; which won't be eliminated. -(define_insn "fsrmsi_restore_volatile" - [(set (reg:SI FRM_REGNUM) - (unspec_volatile:SI [(match_operand:SI 0 "register_operand" "r")] - UNSPECV_FRM_RESTORE_EXIT))] - "TARGET_VECTOR" - "fsrm\t%0" - [(set_attr "type" "wrfrm") - (set_attr "mode"
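
To make the refactored control flow in riscv_emit_frm_mode_set easier to follow, here is a standalone sketch of the decision it takes after the backup step. The enum values and the frm_action helper are hypothetical simplifications (only two static modes are modelled, and the frrm backup for DYN_CALL is left out); the conditions are copied from the new hunk.

```cpp
#include <cassert>

// Simplified stand-ins for the riscv_vector FRM modes; names are hypothetical.
enum class Frm { Rne, Rtz, Dyn, DynCall, DynExit };
enum class Action { None, SetStatic, RestoreBackup };

// Models riscv_static_frm_mode_p for the two static modes in this sketch.
bool is_static(Frm m) { return m == Frm::Rne || m == Frm::Rtz; }

// Decide what to emit when switching from PREV to MODE.  STATIC_FRM_FN
// plays the role of STATIC_FRM_P (cfun).
Action frm_action(Frm mode, Frm prev, bool static_frm_fn) {
  if (mode == prev)
    return Action::None;
  if (is_static(mode))
    return Action::SetStatic;  // write the static rounding mode

  bool restore =
      // No need to restore when the previous mode is DYN.
      (static_frm_fn && mode == Frm::DynCall && prev != Frm::Dyn)
      // No need to restore if the previous mode is DYN or DYN_CALL.
      || (static_frm_fn && mode == Frm::DynExit && prev != Frm::Dyn
          && prev != Frm::DynCall)
      // Restore the saved value when switching to DYN.
      || (mode == Frm::Dyn && prev != Frm::DynCall);

  return restore ? Action::RestoreBackup : Action::None;
}
```

All three dynamic cases now end in the same non-volatile restore, which is the point of the cleanup: the separate fsrmsi_restore_volatile pattern no longer has a distinct role.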
Re: [PATCH] RISC-V: Disable two-source permutes for now [PR117173].
On 1/24/25 3:57 AM, Robin Dapp wrote:

So this isn't a regression, but I can also understand the desire to fix this fairly significant performance issue.

I'd argue it is a regression as the match.pd pattern that merges the permutes was introduced after GCC 14.

Good point. I hadn't thought about it as resolving the performance regression from that change.

After giving it a bit more thought, I'd still like to send the attached v2 because it excludes fewer cases and, consequently, requires fewer changes to the test suite.

Regtested on rv64gcv_zvl512b.

Regards
 Robin

[PATCH v2] RISC-V: Disable two-source permutes for now [PR117173].

After testing on the BPI (4.2% improvement for x264 input 1, 4.4% for input 2) and the discussion in PR117173 I figured it's best to disable the two-source permutes by default for now.

The patch adds a parameter "riscv-two-source-permutes" which restores the old behavior.

	PR target/117173

gcc/ChangeLog:

	* config/riscv/riscv-v.cc (shuffle_generic_patterns): Only
	support single-source permutes by default.
	* config/riscv/riscv.opt: New param "riscv-two-source-permutes".

gcc/testsuite/ChangeLog:

	* gcc.dg/fold-perm-2.c: Run with two-source permutes.
	* gcc.dg/pr54346.c: Ditto.

OK

jeff
Re: [PATCH v1] RISC-V: Remove unnecessary frm restore volatile define_insn
On 1/26/25 6:33 AM, pan2...@intel.com wrote: From: Pan Li After we add the frm register to the global_regs, we may not need to define_insn that volatile to emit the frm restore insns. The cooperatively-managed global register will help to handle this, instead of emit the volatile define_insn explicitly. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_emit_frm_mode_set): Refactor the frm mode set by removing fsrmsi_restore_volatile. * config/riscv/vector-iterators.md (unspecv): Remove as unnecessary. * config/riscv/vector.md (fsrmsi_restore_volatile): Ditto. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/float-point-dynamic-frm-49.c: Adjust the asm dump check times. * gcc.target/riscv/rvv/base/float-point-dynamic-frm-50.c: Ditto. * gcc.target/riscv/rvv/base/float-point-dynamic-frm-52.c: Ditto. * gcc.target/riscv/rvv/base/float-point-dynamic-frm-74.c: Ditto. * gcc.target/riscv/rvv/base/float-point-dynamic-frm-75.c: Ditto. It's a nice cleanup, but let's defer since it doesn't fix a bug. jeff
RE: [PATCH v1] RISC-V: Remove unnecessary frm restore volatile define_insn
> It's a nice cleanup, but let's defer since it doesn't fix a bug. Sure thing, will defer to gcc-16. Pan -Original Message- From: Jeff Law Sent: Sunday, January 26, 2025 9:34 PM To: Li, Pan2 ; gcc-patches@gcc.gnu.org Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; rdapp@gmail.com; vine...@rivosinc.com Subject: Re: [PATCH v1] RISC-V: Remove unnecessary frm restore volatile define_insn On 1/26/25 6:33 AM, pan2...@intel.com wrote: > From: Pan Li > > After we add the frm register to the global_regs, we may not need to > define_insn that volatile to emit the frm restore insns. The > cooperatively-managed global register will help to handle this, instead > of emit the volatile define_insn explicitly. > > gcc/ChangeLog: > > * config/riscv/riscv.cc (riscv_emit_frm_mode_set): Refactor > the frm mode set by removing fsrmsi_restore_volatile. > * config/riscv/vector-iterators.md (unspecv): Remove as unnecessary. > * config/riscv/vector.md (fsrmsi_restore_volatile): Ditto. > > gcc/testsuite/ChangeLog: > > * gcc.target/riscv/rvv/base/float-point-dynamic-frm-49.c: Adjust > the asm dump check times. > * gcc.target/riscv/rvv/base/float-point-dynamic-frm-50.c: Ditto. > * gcc.target/riscv/rvv/base/float-point-dynamic-frm-52.c: Ditto. > * gcc.target/riscv/rvv/base/float-point-dynamic-frm-74.c: Ditto. > * gcc.target/riscv/rvv/base/float-point-dynamic-frm-75.c: Ditto. It's a nice cleanup, but let's defer since it doesn't fix a bug. jeff
[RFC PATCH] i386: Re-alias -mavx10.2 to 512 bit and make -mno-avx10.x-512 disable the whole AVX10.x
Hi all,

AVX10 has been public for a year and a half, and we have received a lot of feedback on it. One recurring question is whether the alias option -mavx10.x should point to 256 bit or 512 bit.

If you also follow the LLVM community, you might have seen this thread on the AVX10 options, sent out just several hours ago:

[X86][AVX10] Disable m[no-]avx10.1 and switch m[no-]avx10.2 to alias of 512 bit options
https://github.com/llvm/llvm-project/pull/124511

In GCC, we will also do so. This RFC patch differs slightly from LLVM and only includes:

- Switch -m[no-]avx10.2 to an alias of the 512 bit options.
- Change -mno-avx10.[1,2]-512 to disable both 256 and 512 bit instructions. As a result, -mno-avx10.2 still disables both 256 and 512 bit instructions, since the alias now points to 512 bit.

It does not include disabling -m[no-]avx10.1, since I still want more input on how to handle that. We actually have three choices:

a. Directly re-alias -m[no-]avx10.1 to -m[no-]avx10.1-512 in GCC 15 and backport to GCC 14.

b. Disable -m[no-]avx10.1 in GCC 15, and add it back pointing to -m[no-]avx10.1-512 in the future. This covers the case where someone cross-compiles with different versions of GCC using -mavx10.1 and would otherwise silently get an unexpected result.

c. Disable -m[no-]avx10.1 in GCC 15, and never add it back. Since the option has been 256 bit, changing it back and forth is messy.

This might be our final chance to change the alias option, since real AVX10.1 hardware is coming soon. The change is x86 specific, so it might still squeeze into GCC 15 at this time.

I call this an RFC patch because the documentation and testcases also need to change accordingly, which makes the patch incomplete. Discussion and input are welcome on this topic.
Thx, Haochen --- gcc/common/config/i386/i386-common.cc | 30 +-- gcc/common/config/i386/i386-isas.h| 2 +- gcc/config/i386/i386-options.cc | 2 +- gcc/config/i386/i386.opt | 4 ++-- gcc/doc/extend.texi | 8 --- gcc/doc/sourcebuild.texi | 4 ++-- 6 files changed, 25 insertions(+), 25 deletions(-) diff --git a/gcc/common/config/i386/i386-common.cc b/gcc/common/config/i386/i386-common.cc index 52ad1c5acd1..3891fca8ecb 100644 --- a/gcc/common/config/i386/i386-common.cc +++ b/gcc/common/config/i386/i386-common.cc @@ -325,14 +325,12 @@ along with GCC; see the file COPYING3. If not see #define OPTION_MASK_ISA2_APX_F_UNSET OPTION_MASK_ISA2_APX_F #define OPTION_MASK_ISA2_EVEX512_UNSET OPTION_MASK_ISA2_EVEX512 #define OPTION_MASK_ISA2_USER_MSR_UNSET OPTION_MASK_ISA2_USER_MSR -#define OPTION_MASK_ISA2_AVX10_1_256_UNSET \ - (OPTION_MASK_ISA2_AVX10_1_256 | OPTION_MASK_ISA2_AVX10_1_512_UNSET \ - | OPTION_MASK_ISA2_AVX10_2_256_UNSET) -#define OPTION_MASK_ISA2_AVX10_1_512_UNSET \ - (OPTION_MASK_ISA2_AVX10_1_512 | OPTION_MASK_ISA2_AVX10_2_512_UNSET) -#define OPTION_MASK_ISA2_AVX10_2_256_UNSET OPTION_MASK_ISA2_AVX10_2_256 -#define OPTION_MASK_ISA2_AVX10_2_512_UNSET \ - (OPTION_MASK_ISA2_AVX10_2_512 | OPTION_MASK_ISA2_AMX_AVX512_UNSET) +#define OPTION_MASK_ISA2_AVX10_1_UNSET \ + (OPTION_MASK_ISA2_AVX10_1_256 | OPTION_MASK_ISA2_AVX10_1_512 \ + | OPTION_MASK_ISA2_AVX10_2_UNSET) +#define OPTION_MASK_ISA2_AVX10_2_UNSET \ + (OPTION_MASK_ISA2_AVX10_2_256 | OPTION_MASK_ISA2_AVX10_2_512 \ + OPTION_MASK_ISA2_AMX_AVX512_UNSET) #define OPTION_MASK_ISA2_AMX_AVX512_UNSET OPTION_MASK_ISA2_AMX_AVX512 #define OPTION_MASK_ISA2_AMX_TF32_UNSET OPTION_MASK_ISA2_AMX_TF32 #define OPTION_MASK_ISA2_AMX_TRANSPOSE_UNSET OPTION_MASK_ISA2_AMX_TRANSPOSE @@ -1378,8 +1376,8 @@ ix86_handle_option (struct gcc_options *opts, } else { - opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_AVX10_1_256_UNSET; - opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_AVX10_1_256_UNSET; + opts->x_ix86_isa_flags2 &= 
~OPTION_MASK_ISA2_AVX10_1_UNSET; + opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_AVX10_1_UNSET; opts->x_ix86_no_avx10_1_explicit = 1; } return true; @@ -1394,8 +1392,8 @@ ix86_handle_option (struct gcc_options *opts, } else { - opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_AVX10_1_512_UNSET; - opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_AVX10_1_512_UNSET; + opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_AVX10_1_UNSET; + opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_AVX10_1_UNSET; opts->x_ix86_no_avx10_1_explicit = 1; } return true; @@ -1410,8 +1408,8 @@ ix86_handle_option (struct gcc_options *opts, } else { - opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_AVX10_2_256_UNSET; - opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_AVX10_2_256_UNSET; + opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_AVX10_2_UNSET; + opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_AVX10_2_UNSET; } return
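
A small sketch of how the merged *_UNSET masks cascade, for readers who want to see the effect in isolation. The bit positions are made up and do not match the real OPTION_MASK_ISA2_* values, and the sketch assumes all three terms of the AVX10.2 mask are meant to be OR-ed together.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bit assignments; the real OPTION_MASK_ISA2_* values differ.
constexpr std::uint64_t AVX10_1_256 = 1u << 0;
constexpr std::uint64_t AVX10_1_512 = 1u << 1;
constexpr std::uint64_t AVX10_2_256 = 1u << 2;
constexpr std::uint64_t AVX10_2_512 = 1u << 3;
constexpr std::uint64_t AMX_AVX512  = 1u << 4;

// Each UNSET mask lists the feature itself plus everything depending on it.
constexpr std::uint64_t AMX_AVX512_UNSET = AMX_AVX512;
constexpr std::uint64_t AVX10_2_UNSET =
    AVX10_2_256 | AVX10_2_512 | AMX_AVX512_UNSET;
constexpr std::uint64_t AVX10_1_UNSET =
    AVX10_1_256 | AVX10_1_512 | AVX10_2_UNSET;

// Clear every bit named in UNSET_MASK, as the -mno-* handlers do with
// "flags &= ~MASK".
constexpr std::uint64_t disable(std::uint64_t flags, std::uint64_t unset_mask) {
  return flags & ~unset_mask;
}
```

Under these assumptions, disabling AVX10.2 at either width leaves only the AVX10.1 bits, and disabling AVX10.1 at either width clears every AVX10 bit together with AMX-AVX512.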
RE: [PATCH v2 1/4] RISC-V: Refactor SAT_* operand rtx extend to reg help func [NFC]
Thanks Jeff, I will resolve the conflict and send v3 after testing.

Pan

-Original Message-
From: Jeff Law
Sent: Monday, January 27, 2025 12:38 AM
To: Li, Pan2; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; rdapp@gmail.com
Subject: Re: [PATCH v2 1/4] RISC-V: Refactor SAT_* operand rtx extend to reg help func [NFC]

On 1/23/25 12:01 AM, pan2...@intel.com wrote:
> From: Pan Li
>
> This patch would like to refactor the helper function of the SAT_*
> scalar. The helper function will convert the define_pattern ops
> to the xmode reg for the underlying code-gen. This patch add
> new parameter for ZERO_EXTEND or SIGN_EXTEND if the input is const_int
> or the mode is non-Xmode.
>
> The below test suites are passed for this patch.
> * The rv64gcv fully regression test.
>
> gcc/ChangeLog:
>
> * config/riscv/riscv.cc (riscv_gen_zero_extend_rtx): Rename from ...
> (riscv_extend_to_xmode_reg): Rename to and add rtx_code for
> zero/sign extend if non-Xmode.
> (riscv_expand_usadd): Leverage the renamed function with ZERO_EXTEND.
> (riscv_expand_ussub): Ditto.

Note that I recently made a small change to riscv_gen_zero_extend_rtx that I think you need to incorporate into your patch. Otherwise I think you'll get a code quality regression on some of the saturation tests.

Combine purposefully doesn't try to simplify expressions like (zero_extend (const_int ...)) or (sign_extend (const_int ...)). It's a historical wart, probably related to the lack of a mode on const_int objects IIRC.

As a result if you ask the old code for an SImode 0x8000 you get a load of 0x8000 (lui) followed by a zero extend (two shifts). What you really want is a li to load 0x1, then a left shift to produce 0x8000. This problem had been previously masked by the mvconst_internal pattern.

The way to get this behavior is to take the incoming constant and mask off all the bits outside the desired mode. Then use gen_int_mode to actually generate a canonical const_int. Then force that into a register with force_reg. ie:

> /* Combine deliberately does not simplify extensions of constants
> (long story). So try to generate the zero extended constant
> efficiently.
>
> First extract the constant and mask off all the bits not in MODE. */
> HOST_WIDE_INT val = INTVAL (x);
> val &= GET_MODE_MASK (mode);
>
> /* X may need synthesis, so do not blindly copy it. */
> xmode_reg = force_reg (Xmode, gen_int_mode (val, Xmode));

I think the upstream ci system hasn't moved the baseline forward in about a week. As a result it's not reporting the failure to apply due to the conflict nor is it reporting the code quality regression.

Jeff
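
The masking step Jeff describes can be tried outside the compiler. This is only a sketch of the arithmetic: zero_extend_const is a hypothetical helper, a bit count stands in for the machine mode, and the mask expression plays the role of GET_MODE_MASK. Nothing here touches RTL or register allocation.

```cpp
#include <cassert>
#include <cstdint>

// Keep only the low BITS bits of VAL (1 <= BITS <= 64).  The result is the
// value a BITS-wide constant has after zero extension to 64 bits, so no
// separate extension instruction is needed once it is materialised.
std::uint64_t zero_extend_const(std::int64_t val, unsigned bits) {
  std::uint64_t mask = bits >= 64 ? ~std::uint64_t{0}
                                  : (std::uint64_t{1} << bits) - 1;
  return static_cast<std::uint64_t>(val) & mask;
}
```

For example, a 32-bit constant stored sign-extended as INT32_MIN comes out as 0x80000000 with all upper bits clear, which is the form one would then hand to constant synthesis.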
[PATCH] libstdc++: Fix localized D_T_FMT %c formatting for [PR117214]
Formatting a time point with %c was implemented by calling std::vprint_to with format string constructed from locale's D_T_FMT string, but in some locales this string does not compliant to chrono-specs. So just use _M_locale_fmt to avoid this problem. libstdc++-v3/ChangeLog: PR libstdc++/117214 * include/bits/chrono_io.h (__formatter_chrono::_M_c): use _M_locale_fmt to format %c time point. * testsuite/std/time/format/pr117214.cc: New test. Signed-off-by: XU Kailiang --- libstdc++-v3/include/bits/chrono_io.h | 35 ++- .../testsuite/std/time/format/pr117214.cc | 32 + 2 files changed, 51 insertions(+), 16 deletions(-) create mode 100644 libstdc++-v3/testsuite/std/time/format/pr117214.cc diff --git a/libstdc++-v3/include/bits/chrono_io.h b/libstdc++- v3/include/bits/chrono_io.h index 6c813bf439d..9a4fa153a98 100644 --- a/libstdc++-v3/include/bits/chrono_io.h +++ b/libstdc++-v3/include/bits/chrono_io.h @@ -787,27 +787,30 @@ namespace __format template typename _FormatContext::iterator - _M_c(const _Tp& __tt, typename _FormatContext::iterator __out, + _M_c(const _Tp& __t, typename _FormatContext::iterator __out, _FormatContext& __ctx, bool __mod = false) const { // %c Locale's date and time representation. // %Ec Locale's alternate date and time representation. 
- basic_string<_CharT> __fmt; - auto __t = _S_floor_seconds(__tt); - locale __loc = _M_locale(__ctx); - const auto& __tp = use_facet<__timepunct<_CharT>>(__loc); - const _CharT* __formats[2]; - __tp._M_date_time_formats(__formats); - if (*__formats[__mod]) [[likely]] - { - __fmt = _GLIBCXX_WIDEN("{:L}"); - __fmt.insert(3u, __formats[__mod]); - } - else - __fmt = _GLIBCXX_WIDEN("{:L%a %b %e %T %Y}"); - return std::vformat_to(std::move(__out), __loc, __fmt, - std::make_format_args<_FormatContext>(__t)); + using namespace chrono; + auto __d = _S_days(__t); + using _TDays = decltype(__d); + const auto __ymd = _S_date(__d); + const auto __y = __ymd.year(); + const auto __hms = _S_hms(__t); + + struct tm __tm{}; + __tm.tm_year = (int)__y - 1900; + __tm.tm_yday = (__d - _TDays(__y/January/1)).count(); + __tm.tm_mon = (unsigned)_S_month(__t) - 1; + __tm.tm_mday = (unsigned)_S_day(__t); + __tm.tm_wday = _S_weekday(__t).c_encoding(); + __tm.tm_hour = __hms.hours().count(); + __tm.tm_min = __hms.minutes().count(); + __tm.tm_sec = __hms.seconds().count(); + return _M_locale_fmt(std::move(__out), _M_locale(__ctx), __tm, 'c', + __mod ? 
'E' : '\0'); } template diff --git a/libstdc++-v3/testsuite/std/time/format/pr117214.cc b/libstdc++-v3/testsuite/std/time/format/pr117214.cc new file mode 100644 index 000..5b36edadfa0 --- /dev/null +++ b/libstdc++-v3/testsuite/std/time/format/pr117214.cc @@ -0,0 +1,32 @@ +// { dg-do run { target c++20 } } +// { dg-require-namedlocale "aa_DJ.UTF-8" } +// { dg-require-namedlocale "ar_SA.UTF-8" } +// { dg-require-namedlocale "ca_AD.UTF-8" } +// { dg-require-namedlocale "az_IR.UTF-8" } +// { dg-require-namedlocale "my_MM.UTF-8" } + +#include +#include +#include + +void +test_c() +{ + const char *test_locales[] = { + "aa_DJ.UTF-8", + "ar_SA.UTF-8", + "ca_AD.UTF-8", + "az_IR.UTF-8", + "my_MM.UTF-8", + }; + for (auto locale_name : test_locales) + { + std::locale::global(std::locale(locale_name)); + VERIFY( !std::format("{:L%c}", std::chrono::sys_seconds()).empty() ); + } +} + +int main() +{ + test_c(); +}
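The idea behind the patch above, handing a filled-in struct tm to the locale's own %c formatting instead of reinterpreting D_T_FMT as a chrono format string, can be sketched outside the library. This is a minimal sketch, not the libstdc++ implementation: `format_c`, `epoch_tm` and `check_format_c` are hypothetical names, and it assumes that going through the standard `std::time_put` facet is a fair stand-in for what `_M_locale_fmt` does.

```cpp
#include <cassert>
#include <ctime>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper (not part of libstdc++): format a broken-down time
// using the locale's own %c (or %Ec) representation via std::time_put.
std::string format_c(const std::tm& tm, const std::locale& loc,
                     bool alternate = false)
{
  assert(tm.tm_mon >= 0 && tm.tm_mon < 12);  // caller must fill the fields
  std::ostringstream os;
  os.imbue(loc);
  const auto& facet = std::use_facet<std::time_put<char>>(loc);
  // 'c' selects the date-and-time representation; the 'E' modifier asks
  // for the locale's alternate form, mirroring the __mod flag in the patch.
  facet.put(std::ostreambuf_iterator<char>(os), os, os.fill(), &tm,
            'c', alternate ? 'E' : '\0');
  return os.str();
}

// Broken-down time for 1970-01-01 00:00:00, the value of a
// default-constructed std::chrono::sys_seconds.
std::tm epoch_tm()
{
  std::tm tm{};
  tm.tm_year = 70;  // years since 1900
  tm.tm_mon = 0;    // January
  tm.tm_mday = 1;
  tm.tm_wday = 4;   // Thursday
  tm.tm_yday = 0;
  return tm;
}

// Usage: the same property the new test checks, a non-empty result.
void check_format_c()
{
  assert(!format_c(epoch_tm(), std::locale::classic()).empty());
}
```

Because the facet interprets the locale's format itself, it does not matter whether that format would be a valid chrono-specs string.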
Re: [PATCH v3 0/4] Hard Register Constraints
On 1/23/25 8:49 AM, Stefan Schulze Frielinghaus wrote:
> On Sat, Jan 18, 2025 at 09:36:14AM -0700, Jeff Law wrote:
> [...]
> > > > Do we detect conflicts between a hard register constraint and another
> > > > constraint which requires a singleton class? That's going to be an
> > > > error I suspect, but curious if it's handled.
> > >
> > > That is a good point. Currently I suspect no. I will have a look.
> > Thanks. It's not the most important thing on our plate, but given the way
> > x86 is structured we probably need to do something sensible here.
> >
> > I also worry a bit about non-singleton classes that the target may have
> > added to CLASS_LIKELY_SPILLED_P, though unlike the singleton case, there's
> > at least a chance these will work, albeit potentially generating poor code
> > when an object needs spilling. I also don't think it's terribly common to
> > add non-singleton classes to that set.
>
> I was first worried that the single register class construct is somewhat
> special. To me, it turns out that they behave very similar to my
> current draft. Basically during LRA in process_alt_operands() I'm
> installing
Yea, I would think they'd largely behave like your proposal. Given the presence of a singleton class one could use that to write an ASM just as effectively as your hard register proposal. And satisfying the constraints should boil down to the same basic process.

What your proposal does is give users fine grained control as-if the port had a singleton class for every register -- without us having to add all those pesky register classes.

[ ... ]

> (I have tested those only on x86_64 so far but I expect them to work on
> 32-bit, too, modulo int128)
>
> I will include those, and of course, similar ones for constraints
> b,c,d,S,D in a future patch revision. If there is any other target with
> non-ordinary register classes/constraints/whatnot just let me know and I
> will have a look.
Thanks for pulling together some tests around that.

armv7 might be interesting to play with, though I suspect it'll just work. It adds LO_REGS to the likely spilled classes when thumb is enabled. LO_REGS isn't a singleton class. So it may be worth a quick test on that target. A few others do similar things (arm, pru, etc). But again, I think it'll just work since it sounds like singletons already work.

Jeff
Re: [PATCH] RISC-V: ensure needed FRM restore is not eliminable [PR118646]
On 1/24/25 3:12 PM, Vineet Gupta wrote:
> RV-Vector FP-INT insns use the rounding mode in FRM register which if
> explicitly set for V insn needs, is saved/restored (although from the
> psABI CC Spec, it is not clear if it is actually a caller-saved or
> callee-saved).
>
> Anyhow in the failure case the save/restore were generated by the Mode
> Switch pass, but then eliminated by sched1:DCE and Late-Combine.
>
> Fix this by using unspec_volatile variant which won't be eliminated.
>
> This showed up as SPEC2017 527.cam4 runtime aborts in glibc:round_away()
> which checks for standard rounding modes and the "leaking" rounding mode
> due to the bug happened to be a non-standard RISC-V specific RMM
> "Round to Nearest, ties to Max".
>
> This is testsuite clean:
Not sure how it could be clean as I think the test itself is busted ;-) As-is it'll trigger compile time failures:

FAIL: gfortran.target/riscv/rvv/pr118646.f90 -O0 (test for excess errors)
Excess errors:
/home/jlaw/test/gcc/gcc/testsuite/gfortran.target/riscv/rvv/pr118646.f90:18:12: Warning: Deleted feature: End expression in DO loop at (1) must be integer
/home/jlaw/test/gcc/gcc/testsuite/gfortran.target/riscv/rvv/pr118646.f90:22:15: Warning: Deleted feature: End expression in DO loop at (1) must be integer
/home/jlaw/test/gcc/gcc/testsuite/gfortran.target/riscv/rvv/pr118646.f90:36:18: Warning: Deleted feature: End expression in DO loop at (1) must be integer

A "-w" will work around that, but then there's no Fortran equivalent of main and we get a link error:

FAIL: gfortran.target/riscv/rvv/pr118646.f90 -O0 (test for excess errors)
Excess errors:
/release/linux/.build/src/glibc-git-4d29ec7c/csu/../sysdeps/riscv/start.S:67:(.text+0x22): undefined reference to `main'
UNRESOLVED: gfortran.target/riscv/rvv/pr118646.f90 -O0 compilation failed to produce executable

What I wanted to do was use your testcase with Pan's patch to see if Pan's patch resolved both issues. Your compiler patch may still be desirable as well; I haven't really evaluated that yet.

jeff

ps. Pre-commit also failed on the new test: https://github.com/ewlu/gcc-precommit-ci/issues/3051#issuecomment-2613516130
Re: [PATCH v3 0/4] Hard Register Constraints
On Sun, Jan 26, 2025 at 08:35:29AM -0700, Jeff Law wrote:
> On 1/23/25 8:49 AM, Stefan Schulze Frielinghaus wrote:
> > On Sat, Jan 18, 2025 at 09:36:14AM -0700, Jeff Law wrote:
> > [...]
> > > > > Do we detect conflicts between a hard register constraint and another
> > > > > constraint which requires a singleton class? That's going to be an
> > > > > error I suspect, but curious if it's handled.
> > > >
> > > > That is a good point. Currently I suspect no. I will have a look.
> > > Thanks. It's not the most important thing on our plate, but given the way
> > > x86 is structured we probably need to do something sensible here.
> > >
> > > I also worry a bit about non-singleton classes that the target may have
> > > added to CLASS_LIKELY_SPILLED_P, though unlike the singleton case, there's
> > > at least a chance these will work, albeit potentially generating poor code
> > > when an object needs spilling. I also don't think it's terribly common to
> > > add non-singleton classes to that set.
> >
> > I was first worried that the single register class construct is somewhat
> > special. To me, it turns out that they behave very similar to my
> > current draft. Basically during LRA in process_alt_operands() I'm
> > installing
> Yea, I would think they'd largely behave like your proposal. Given the
> presence of a singleton class one could use that to write an ASM just as
> effectively as your hard register proposal. And satisfying the constraints
> should boil down to the same basic process.
>
> What your proposal does is give users fine grained control as-if the port
> had a singleton class for every register -- without us having to add all
> those pesky register classes.

Though, don't we have various hacks for small register classes? I mean e.g. targetm.class_likely_spilled_p calls in combine/cse/loop-invariant etc. If all the registers could be made to behave similarly, don't we need to create those similarly?

        Jakub
[PATCH] c++, v3: Implement for namespace statics CWG 2867 - Order of initialization for structured bindings [PR115769]
On Sat, Jan 25, 2025 at 10:53:50AM -0500, Jason Merrill wrote:
> On 1/25/25 4:12 AM, Jakub Jelinek wrote:
> > On Fri, Jan 24, 2025 at 07:07:15PM -0500, Jason Merrill wrote:
> > > Hypothetically, but those cases are just either error or DECL_EXTERNAL. In
> > > the error case we're failing anyway; in the external case all the
> > > base/nonbase for a particular structured binding declaration should be
> > > consistent.
> >
> > So shall I just remove all the prune_vars_needing_no_initialization hunks
> > then or add gcc_checking_assert (!STATIC_INIT_DECOMP_BASE_P (t) &&
> > !STATIC_INIT_DECOMP_NONBASE_P (t)); for the DECL_EXTERNAL punt case?
>
> Just the assert sounds good.
>
> > > > Note, unfortunately it is hard to come up with a testcase that actually
> > > > prunes something on purpose...
> > >
> > > Indeed, it shouldn't be possible.

So like this?

Passed x86_64-linux and i686-linux bootstrap/regtest.

2025-01-26 Jakub Jelinek

        PR c++/115769
gcc/cp/
        * cp-tree.h (STATIC_INIT_DECOMP_BASE_P): Define.
        (STATIC_INIT_DECOMP_NONBASE_P): Define.
        * decl.cc (cp_finish_decl): Mark nodes in {static,tls}_aggregates
        with
        * decl2.cc (decomp_handle_one_var, decomp_finalize_var_list): New
        functions.
        (emit_partial_init_fini_fn): Use them.
        (prune_vars_needing_no_initialization): Assert
        STATIC_INIT_DECOMP_*BASE_P is not set on DECL_EXTERNAL vars to be
        pruned out.
        (partition_vars_for_init_fini): Use same priority for consecutive
        STATIC_INIT_DECOMP_*BASE_P vars and propagate those flags to new
        TREE_LISTs when possible. Formatting fix.
        (handle_tls_init): Use decomp_handle_one_var and
        decomp_finalize_var_list functions.
gcc/testsuite/
        * g++.dg/DRs/dr2867-5.C: New test.
        * g++.dg/DRs/dr2867-6.C: New test.
        * g++.dg/DRs/dr2867-7.C: New test.
        * g++.dg/DRs/dr2867-8.C: New test.
--- gcc/cp/cp-tree.h.jj 2024-09-07 09:31:20.601484156 +0200 +++ gcc/cp/cp-tree.h2024-09-09 15:53:44.924112247 +0200 @@ -470,6 +470,7 @@ extern GTY(()) tree cp_global_trees[CPTI BASELINK_FUNCTIONS_MAYBE_INCOMPLETE_P (in BASELINK) BIND_EXPR_VEC_DTOR (in BIND_EXPR) ATOMIC_CONSTR_EXPR_FROM_CONCEPT_P (in ATOMIC_CONSTR) + STATIC_INIT_DECOMP_BASE_P (in the TREE_LIST for {static,tls}_aggregates) 2: IDENTIFIER_KIND_BIT_2 (in IDENTIFIER_NODE) ICS_THIS_FLAG (in _CONV) DECL_INITIALIZED_BY_CONSTANT_EXPRESSION_P (in VAR_DECL) @@ -489,6 +490,8 @@ extern GTY(()) tree cp_global_trees[CPTI IMPLICIT_CONV_EXPR_BRACED_INIT (in IMPLICIT_CONV_EXPR) PACK_EXPANSION_AUTO_P (in *_PACK_EXPANSION) contract_semantic (in ASSERTION_, PRECONDITION_, POSTCONDITION_STMT) + STATIC_INIT_DECOMP_NONBASE_P (in the TREE_LIST + for {static,tls}_aggregates) 3: IMPLICIT_RVALUE_P (in NON_LVALUE_EXPR or STATIC_CAST_EXPR) ICS_BAD_FLAG (in _CONV) FN_TRY_BLOCK_P (in TRY_BLOCK) @@ -5947,6 +5950,21 @@ extern bool defer_mangling_aliases; extern bool flag_noexcept_type; +/* True if this TREE_LIST in {static,tls}_aggregates is a for dynamic + initialization of namespace scope structured binding base or related + extended ref init temps. Temporaries from the initialization of + STATIC_INIT_DECOMP_BASE_P dynamic initializers should be destroyed only + after the last STATIC_INIT_DECOMP_NONBASE_P dynamic initializer following + it. */ +#define STATIC_INIT_DECOMP_BASE_P(NODE) \ + TREE_LANG_FLAG_1 (TREE_LIST_CHECK (NODE)) + +/* True if this TREE_LIST in {static,tls}_aggregates is a for dynamic + initialization of namespace scope structured binding non-base + variable using get. */ +#define STATIC_INIT_DECOMP_NONBASE_P(NODE) \ + TREE_LANG_FLAG_2 (TREE_LIST_CHECK (NODE)) + /* A list of namespace-scope objects which have constructors or destructors which reside in the global scope. 
The decl is stored in the TREE_VALUE slot and the initializer is stored in the --- gcc/cp/decl.cc.jj 2024-09-09 11:50:07.146394047 +0200 +++ gcc/cp/decl.cc 2024-09-09 17:16:26.459094150 +0200 @@ -8485,6 +8485,7 @@ cp_finish_decl (tree decl, tree init, bo bool var_definition_p = false; tree auto_node; auto_vec extra_cleanups; + tree aggregates1 = NULL_TREE; struct decomp_cleanup { tree decl; cp_decomp *&decomp; @@ -8872,7 +8873,16 @@ cp_finish_decl (tree decl, tree init, bo } if (decomp) - cp_maybe_mangle_decomp (decl, decomp); + { + cp_maybe_mangle_decomp (decl, decomp); + if (TREE_STATIC (decl) && !DECL_FUNCTION_SCOPE_P (decl)) + { + if (CP_DECL_THREAD_LOCAL_P (decl)) + aggregates1 = tls_aggregates; + else + aggregates1 = static_aggregates; + } + } /* If this is a local variable that will need a mangled name, register it now. We must do this before pro
Re: [PATCH v2 2/4] RISC-V: Fix incorrect code gen for scalar signed SAT_ADD [PR117688]
On 1/23/25 12:01 AM, pan2...@intel.com wrote:
> From: Pan Li
>
> This patch would like to fix the wrong code generation for the scalar
> signed SAT_ADD. The input can be QI/HI/SI/DI while the alu like sub
> can only work on Xmode. Unfortunately we don't have sub/add for
> non-Xmode like QImode in scalar, thus we need to sign extend to Xmode
> to ensure we have the correct value before ALU like add. The gen_lowpart
> will generate something like lbu which has all zero for highest bits.
>
> For example, when 0xff (-1 for QImode) plus 0x2 (2 for QImode), we actually
> want -1 + 2 = 1, but if there is no sign extend, as with lbu, we will get
> 0xff + 2 = 0x101 which is incorrect. Thus, we have to sign extend 0xff (QImode)
> to 0xffffffffffffffff (assume Xmode is DImode) before plus in Xmode.
>
> The following test suites passed for this patch.
> * The rv64gcv full regression test.
>
>         PR target/117688
>
> gcc/ChangeLog:
>
>         * config/riscv/riscv.cc (riscv_expand_ssadd): Leverage the helper
>         riscv_extend_to_xmode_reg with SIGN_EXTEND.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/riscv/pr117688-add-run-1-s16.c: New test.
>         * gcc.target/riscv/pr117688-add-run-1-s32.c: New test.
>         * gcc.target/riscv/pr117688-add-run-1-s64.c: New test.
>         * gcc.target/riscv/pr117688-add-run-1-s8.c: New test.
>         * gcc.target/riscv/pr117688.h: New test.
Conceptually OK. We just need to get the helper fixed up properly, then retest before committing.

jeff
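The arithmetic in the commit message can be checked with a small stand-alone sketch. The helper names below are hypothetical and are not GCC functions; a 64-bit integer plays the role of an Xmode register on rv64, and the 8-bit pattern plays the role of a QImode operand.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers; the names are illustrative, not GCC functions.

// Widen an 8-bit pattern by filling the upper bits with zeros, as an
// unsigned byte load such as lbu does.
std::int64_t zero_extend8(std::uint8_t bits)
{
  return static_cast<std::int64_t>(bits);
}

// Widen an 8-bit pattern by replicating bit 7 into the upper bits, which
// is what a signed QImode operand needs before a full-width add or sub.
std::int64_t sign_extend8(std::uint8_t bits)
{
  return (static_cast<std::int64_t>(bits) ^ 0x80) - 0x80;
}

// Usage: the values from the SAT_ADD example (0xff is -1 in QImode).
void check_extend8()
{
  assert(sign_extend8(0xff) + sign_extend8(0x02) == 1);      // -1 + 2
  assert(zero_extend8(0xff) + zero_extend8(0x02) == 0x101);  // wrong input
}
```

The same distinction applies to subtraction: with sign extension 0xff minus 0x01 gives -2, while the zero-extended operands give 0xfe.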
Re: [PATCH v2 1/4] RISC-V: Refactor SAT_* operand rtx extend to reg help func [NFC]
On 1/23/25 12:01 AM, pan2...@intel.com wrote:
> From: Pan Li
>
> This patch would like to refactor the helper function of the SAT_*
> scalar. The helper function will convert the define_pattern ops
> to the xmode reg for the underlying code-gen. This patch adds a new
> parameter for ZERO_EXTEND or SIGN_EXTEND if the input is const_int
> or the mode is non-Xmode.
>
> The following test suites passed for this patch.
> * The rv64gcv full regression test.
>
> gcc/ChangeLog:
>
>         * config/riscv/riscv.cc (riscv_gen_zero_extend_rtx): Rename from ...
>         (riscv_extend_to_xmode_reg): Rename to and add rtx_code for
>         zero/sign extend if non-Xmode.
>         (riscv_expand_usadd): Leverage the renamed function with ZERO_EXTEND.
>         (riscv_expand_ussub): Ditto.
Note that I recently made a small change to riscv_gen_zero_extend_rtx that I think you need to incorporate into your patch. Otherwise I think you'll get a code quality regression on some of the saturation tests.

Combine purposefully doesn't try to simplify expressions like (zero_extend (const_int ...)) or (sign_extend (const_int ...)). It's a historical wart, probably related to the lack of a mode on const_int objects IIRC.

As a result if you ask the old code for an SImode 0x8000 you get a load of 0x8000 (lui) followed by a zero extend (two shifts). What you really want is a li to load 0x1, then a left shift to produce 0x8000. This problem had been previously masked by the mvconst_internal pattern.

The way to get this behavior is to take the incoming constant and mask off all the bits outside the desired mode. Then use gen_int_mode to actually generate a canonical const_int. Then force that into a register with force_reg. ie:

      /* Combine deliberately does not simplify extensions of constants
         (long story).  So try to generate the zero extended constant
         efficiently.

         First extract the constant and mask off all the bits not in MODE.  */
      HOST_WIDE_INT val = INTVAL (x);
      val &= GET_MODE_MASK (mode);

      /* X may need synthesis, so do not blindly copy it.  */
      xmode_reg = force_reg (Xmode, gen_int_mode (val, Xmode));

I think the upstream ci system hasn't moved the baseline forward in about a week. As a result it's not reporting the failure to apply due to the conflict nor is it reporting the code quality regression.

Jeff
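The masking step suggested above can be illustrated on plain integers. This is only a sketch under stated assumptions: `mode_mask` and `zero_extended_constant` are hypothetical stand-ins for the roles that GET_MODE_MASK and the masked value passed to gen_int_mode play in the snippet, a 64-bit integer stands in for HOST_WIDE_INT, and the 32-bit constant is chosen purely for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a mode mask: all ones in the low BITS bits.
std::uint64_t mode_mask(unsigned bits)
{
  assert(bits >= 1 && bits <= 64);
  return bits == 64 ? ~std::uint64_t{0} : (std::uint64_t{1} << bits) - 1;
}

// Drop every bit outside the narrow mode.  The result is the value the
// wide register should hold once the constant is zero extended, so it
// can be synthesized directly instead of being loaded and then extended.
std::uint64_t zero_extended_constant(std::int64_t val, unsigned bits)
{
  return static_cast<std::uint64_t>(val) & mode_mask(bits);
}

// Usage: a 32-bit constant with only its top bit set is held sign
// extended as a host integer; after masking it is simply 1 << 31.
void check_masking()
{
  const std::int64_t val = static_cast<std::int64_t>(INT32_MIN);
  assert(zero_extended_constant(val, 32) == (std::uint64_t{1} << 31));
}
```

Once the constant is in this masked form, producing it as a small immediate followed by a single shift is a matter for the constant synthesis code rather than for an explicit extension.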
Re: [PATCH v2 3/4] RISC-V: Fix incorrect code gen for scalar signed SAT_SUB [PR117688]
On 1/23/25 12:01 AM, pan2...@intel.com wrote:
> From: Pan Li
>
> This patch would like to fix the wrong code generation for the scalar
> signed SAT_SUB. The input can be QI/HI/SI/DI while the alu like sub
> can only work on Xmode. Unfortunately we don't have sub/add for
> non-Xmode like QImode in scalar, thus we need to sign extend to Xmode
> to ensure we have the correct value before ALU like sub. The gen_lowpart
> will generate something like lbu which has all zero for highest bits.
>
> For example, when 0xff (-1 for QImode) sub 0x1 (1 for QImode), we actually
> want -1 - 1 = -2, but if there is no sign extend, as with lbu, we will get
> 0xff - 1 = 0xfe which is incorrect. Thus, we have to sign extend 0xff (QImode)
> to 0xffffffffffffffff (assume Xmode is DImode) before sub in Xmode.
>
> The following test suites passed for this patch.
> * The rv64gcv full regression test.
>
>         PR target/117688
>
> gcc/ChangeLog:
>
>         * config/riscv/riscv.cc (riscv_expand_sssub): Leverage the helper
>         riscv_extend_to_xmode_reg with SIGN_EXTEND.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/riscv/pr117688.h: Add test helper macro.
>         * gcc.target/riscv/pr117688-sub-run-1-s16.c: New test.
>         * gcc.target/riscv/pr117688-sub-run-1-s32.c: New test.
>         * gcc.target/riscv/pr117688-sub-run-1-s64.c: New test.
>         * gcc.target/riscv/pr117688-sub-run-1-s8.c: New test.
Again, conceptually OK. We'll just need to make sure to retest after you adjust the helper. Similarly for patch #4 in this series.

jeff
[committed, obvious] OpenMP: Fix typo in atomic directive error message
gcc/fortran/ChangeLog * openmp.cc (resolve_omp_atomic): Fix typo in error message. gcc/testsuite/ChangeLog * gfortran.dg/gomp/atomic-26.f90: Correct expected output after fixing typo in error message. --- gcc/fortran/openmp.cc| 2 +- gcc/testsuite/gfortran.dg/gomp/atomic-26.f90 | 6 +++--- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc index be78aa1ab27..7875341b2cf 100644 --- a/gcc/fortran/openmp.cc +++ b/gcc/fortran/openmp.cc @@ -10410,7 +10410,7 @@ resolve_omp_atomic (gfc_code *code) gfc_intrinsic_op alt_op = INTRINSIC_NONE; if (atomic_code->ext.omp_clauses->fail != OMP_MEMORDER_UNSET) - gfc_error ("!$OMP ATOMIC UPDATE at %L with FAIL clause requiries either" + gfc_error ("!$OMP ATOMIC UPDATE at %L with FAIL clause requires either" " the COMPARE clause or using the intrinsic MIN/MAX " "procedure", &atomic_code->loc); switch (op) diff --git a/gcc/testsuite/gfortran.dg/gomp/atomic-26.f90 b/gcc/testsuite/gfortran.dg/gomp/atomic-26.f90 index 6448bd9b8bb..3d88cd72d8d 100644 --- a/gcc/testsuite/gfortran.dg/gomp/atomic-26.f90 +++ b/gcc/testsuite/gfortran.dg/gomp/atomic-26.f90 @@ -38,11 +38,11 @@ real function bar (y, e, f) v = d !$omp atomic fail(relaxed), write! { dg-error "FAIL clause is incompatible with READ or WRITE" } d = v - !$omp atomic fail(relaxed) update! { dg-error "FAIL clause requiries either the COMPARE clause or using the intrinsic MIN/MAX procedure" } + !$omp atomic fail(relaxed) update! { dg-error "FAIL clause requires either the COMPARE clause or using the intrinsic MIN/MAX procedure" } d = d + 3.0 - !$omp atomic fail(relaxed) ! { dg-error "FAIL clause requiries either the COMPARE clause or using the intrinsic MIN/MAX procedure" } + !$omp atomic fail(relaxed) ! { dg-error "FAIL clause requires either the COMPARE clause or using the intrinsic MIN/MAX procedure" } d = d + 3.0 - !$omp atomic capture fail(relaxed) ! 
{ dg-error "FAIL clause requiries either the COMPARE clause or using the intrinsic MIN/MAX procedure" } + !$omp atomic capture fail(relaxed) ! { dg-error "FAIL clause requires either the COMPARE clause or using the intrinsic MIN/MAX procedure" } v = d; d = d + 3.0 !$omp atomic read weak ! { dg-error "WEAK clause requires COMPARE clause" } v = d -- 2.34.1