Re: [PATCH 0/4] Fortran: Improve flow of intrinsics/library documentation [PR47928]
On 2/28/25 02:56, Andre Vehreschild wrote:
> Hi Sandra,
>
> thanks for taking on the laborious task. I have browsed over the changes and found:
>
> Patch 3 in intrinsic.texi:
>
> @@ -2071,6 +2071,9 @@ end program atomic
>  @cindex Atomic subroutine, ADD with fetch
>
>  @table @asis
> +@item @emph{Synopsis}:
> +@code{CALL ATOMIC_FETCH_ADD (ATOM, VALUE, old [, STAT])}
> +
>
> `old` should be uppercase here, too, for consistency. Yes, I know, that is nothing you changed. I just stumbled over it and while we are at it, let's address it.
>
> Same for:
>
> @@ -3074,6 +3074,9 @@ end program test_btest
>  @cindex pointer, C association status
>
>  @table @asis
> +@item @emph{Synopsis}:
> +@code{RESULT = C_ASSOCIATED(c_ptr_1[, c_ptr_2])}
>
> With uppercasing in the following paragraph needed, too. And I vote for using CPTR1 and CPTR2 instead.
>
> Same here:
>
> @@ -3177,6 +3177,9 @@ end program main
>  @cindex pointer, C address of pointers
>
>  @table @asis
> +@item @emph{Synopsis}:
> +@code{CALL C_F_PROCPOINTER(cptr, fptr)}
>
> and here:
>
> @@ -3235,6 +3235,9 @@ end program main
>  @cindex pointer, C address of procedures
>
>  @table @asis
> +@item @emph{Synopsis}:
> +@code{RESULT = C_FUNLOC(x)}
> +
>
> I'd say: "Ok, I'll stop." here, but that is the list of changes needed to get the description in intrinsic.texi neat.
>
> In part 4 of your patch, can you rephrase:
>
> @@ -1118,6 +1114,10 @@ program test_allocated
>    if (.not. allocated(x)) allocate(x(i))
>  end program test_allocated
>  @end smallexample
> +
> +@item @emph{Standard}:
> +Fortran 90 and later. Note, the @code{SCALAR=} keyword and allocatable
> +scalar entities are available in Fortran 2003 and later.
>  @end table
>
> to
>
> +Fortran 90 and later; for @code{SCALAR=} keyword and allocatable
> +scalar entities Fortran 2003 and later.
>
> Just for consistency.
>
> With these changes, ok for mainline. Thank you very much for taking on that laborious task. My deepest respect!

Thanks for the review! I've pushed the changes now, along with the attached additional patch to address those existing minor issues you identified.
As I said, there are a lot of remaining markup and formatting problems in there as well. :-( -Sandra From 3cfe5832d049c55cacc5f73431a4a14e97b2659f Mon Sep 17 00:00:00 2001 From: Sandra Loosemore Date: Sun, 2 Mar 2025 01:43:26 + Subject: [PATCH] Fortran: Small fixes in intrinsic.texi. gcc/fortran/ChangeLog * intrinsic.texi: Fix inconsistent capitalization of argument names and other minor copy-editing. --- gcc/fortran/intrinsic.texi | 26 +- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/gcc/fortran/intrinsic.texi b/gcc/fortran/intrinsic.texi index 4e6d2faea31..8c160e58b00 100644 --- a/gcc/fortran/intrinsic.texi +++ b/gcc/fortran/intrinsic.texi @@ -1116,8 +1116,8 @@ end program test_allocated @end smallexample @item @emph{Standard}: -Fortran 90 and later. Note, the @code{SCALAR=} keyword and allocatable -scalar entities are available in Fortran 2003 and later. +Fortran 90 and later; for the @code{SCALAR=} keyword and allocatable +scalar entities, Fortran 2003 and later. @end table @@ -2072,7 +2072,7 @@ Fortran 2008 and later; with @var{STAT}, TS 18508 or later @table @asis @item @emph{Synopsis}: -@code{CALL ATOMIC_FETCH_ADD (ATOM, VALUE, old [, STAT])} +@code{CALL ATOMIC_FETCH_ADD (ATOM, VALUE, OLD [, STAT])} @item @emph{Description}: @code{ATOMIC_FETCH_ADD(ATOM, VALUE, OLD)} atomically stores the value of @@ -3075,24 +3075,24 @@ for @code{UNSIGNED} (@pxref{Unsigned integers}) @table @asis @item @emph{Synopsis}: -@code{RESULT = C_ASSOCIATED(c_ptr_1[, c_ptr_2])} +@code{RESULT = C_ASSOCIATED(CPTR1[, CPTR2])} @item @emph{Description}: -@code{C_ASSOCIATED(c_ptr_1[, c_ptr_2])} determines the status of the C pointer -@var{c_ptr_1} or if @var{c_ptr_1} is associated with the target @var{c_ptr_2}. +@code{C_ASSOCIATED(CPTR1[, CPTR2])} determines the status of the C pointer +@var{CPTR1} or if @var{CPTR1} is associated with the target @var{CPTR2}. 
@item @emph{Class}: Inquiry function @item @emph{Arguments}: @multitable @columnfractions .15 .70 -@item @var{c_ptr_1} @tab Scalar of the type @code{C_PTR} or @code{C_FUNPTR}. -@item @var{c_ptr_2} @tab (Optional) Scalar of the same type as @var{c_ptr_1}. +@item @var{CPTR1} @tab Scalar of the type @code{C_PTR} or @code{C_FUNPTR}. +@item @var{CPTR2} @tab (Optional) Scalar of the same type as @var{CPTR1}. @end multitable @item @emph{Return value}: The return value is of type @code{LOGICAL}; it is @code{.false.} if either -@var{c_ptr_1} is a C NULL pointer or if @var{c_ptr1} and @var{c_ptr_2} +@var{CPTR1} is a C NULL pointer or if @var{CPTR1} and @var{CPTR2} point to different addresses. @item @emph{Example}: @@ -3178,7 +3178,7 @@ Fortran 2003 and later @table @asis @item @emph{Synopsis}: -@code{CALL C_F_PROCPOINTER(cptr, fptr)} +@code{CALL C_F_PROCPOINTER(CPTR, FPTR)} @item @emph{Description}: @code{C_F_PROCPOINTER(CPTR, FPTR)} Assign the target of the C function pointer @@ -3236,17 +3236,17 @@
[PATCH] Fortran: reject empty derived type with bind(C) attribute [PR101577]
Dear all, due to an oversight in the Fortran standard before 2018, empty derived types with bind(C) attribute were explicitly (deliberately?) accepted by gfortran, giving a warning that the companion processor might not provide an interoperating entity. In the PR, Tobias pointed to a discussion on the J3 ML that there was a defect in older standards. The attached patch now generates an error when -std=f20xx is specified, and continues to generate a warning otherwise. Regtested on x86_64-pc-linux-gnu. OK for mainline? Thanks, Harald From 5c38ce50ed7cca905401f6fa6506b47fd79a7739 Mon Sep 17 00:00:00 2001 From: Harald Anlauf Date: Sun, 2 Mar 2025 22:20:28 +0100 Subject: [PATCH] Fortran: reject empty derived type with bind(C) attribute [PR101577] PR fortran/101577 gcc/fortran/ChangeLog: * symbol.cc (verify_bind_c_derived_type): Generate error message for derived type with no components in standard conformance mode, indicating that this is a GNU extension. gcc/testsuite/ChangeLog: * gfortran.dg/empty_derived_type.f90: Adjust dg-options. * gfortran.dg/empty_derived_type_2.f90: New test. --- gcc/fortran/symbol.cc | 22 --- .../gfortran.dg/empty_derived_type.f90| 1 + .../gfortran.dg/empty_derived_type_2.f90 | 11 ++ 3 files changed, 31 insertions(+), 3 deletions(-) create mode 100644 gcc/testsuite/gfortran.dg/empty_derived_type_2.f90 diff --git a/gcc/fortran/symbol.cc b/gcc/fortran/symbol.cc index c6894810bce..9ddf13b3f0d 100644 --- a/gcc/fortran/symbol.cc +++ b/gcc/fortran/symbol.cc @@ -4624,12 +4624,28 @@ verify_bind_c_derived_type (gfc_symbol *derived_sym) entity may be defined by means of C and the Fortran entity is said to be interoperable with the C entity. There does not have to be such an interoperating C entity." 
+ + However, later discussion on the J3 mailing list + (https://mailman.j3-fortran.org/pipermail/j3/2021-July/013190.html) + found this to be a defect, and Fortran 2018 added in section 18.3.4 + the following constraint: + "C1805: A derived type with the BIND attribute shall have at least one + component." + + We thus allow empty derived types only as GNU extension while giving a + warning by default, or reject empty types in standard conformance mode. */ if (curr_comp == NULL) { - gfc_warning (0, "Derived type %qs with BIND(C) attribute at %L is empty, " - "and may be inaccessible by the C companion processor", - derived_sym->name, &(derived_sym->declared_at)); + if (!gfc_notify_std (GFC_STD_GNU, "Derived type %qs with BIND(C) " + "attribute at %L has no components", + derived_sym->name, &(derived_sym->declared_at))) + return false; + else if (!pedantic) + gfc_warning (0, "Derived type %qs with BIND(C) attribute at %L " + "is empty, and may be inaccessible by the C " + "companion processor", + derived_sym->name, &(derived_sym->declared_at)); derived_sym->ts.is_c_interop = 1; derived_sym->attr.is_bind_c = 1; return true; diff --git a/gcc/testsuite/gfortran.dg/empty_derived_type.f90 b/gcc/testsuite/gfortran.dg/empty_derived_type.f90 index 6bf616c2c6a..496262de2cd 100644 --- a/gcc/testsuite/gfortran.dg/empty_derived_type.f90 +++ b/gcc/testsuite/gfortran.dg/empty_derived_type.f90 @@ -1,4 +1,5 @@ ! { dg-do compile } +! { dg-options "" } module stuff implicit none type, bind(C) :: junk ! { dg-warning "may be inaccessible by the C companion" } diff --git a/gcc/testsuite/gfortran.dg/empty_derived_type_2.f90 b/gcc/testsuite/gfortran.dg/empty_derived_type_2.f90 new file mode 100644 index 000..1ef56da4c25 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/empty_derived_type_2.f90 @@ -0,0 +1,11 @@ +! { dg-do compile } +! { dg-additional-options "-std=f2018" } +! +! PR fortran/101577 +! +! Contributed by Tobias Burnus + +type, bind(C) :: t ! { dg-error "has no components" } + ! 
Empty! +end type t +end -- 2.43.0
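For illustration only (not part of the patch): the C side has a similar situation. ISO C requires a structure to declare at least one member, while GCC accepts a memberless structure as a GNU extension and gives it size zero. A minimal sketch, assuming it is compiled as GNU C without -pedantic-errors; the names empty_t and empty_size are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* GNU C extension: a structure with no members.  ISO C does not allow
   this, which is one reason a C companion processor may have no
   interoperating entity for an empty BIND(C) derived type.  */
struct empty_t { };

/* Hypothetical helper: report the size the compiler gives the type.
   GCC documents that such a structure has size zero in C.  */
size_t
empty_size (void)
{
  return sizeof (struct empty_t);
}

/* Small self-check under the GNU C assumption stated above.  */
void
check_empty_size (void)
{
  assert (empty_size () == 0);
}
```

Compiling the same snippet with -pedantic should produce a diagnostic, which mirrors the warning-by-default, error-in-conformance-mode behaviour proposed for gfortran.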
Re: [PATCH] rtl: Remove invalid compare simplification [PR117186]
On 2025-01-13 17:48, Tobias Burnus wrote:
> Andreas Schwab wrote:
>> This breaks m68k:
>
> Same issue on GCN, hence I filed https://gcc.gnu.org/PR118418

This also breaks gcc bootstrap on riscv64 under some specific configuration:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119012#c12

Although it doesn't seem to be the same issue here.

> If I look at the debugging output, see PR, it seems as if the self-test
> function test_comparisons contains the assumption:
>
>   FALSE < TRUE
>
> but if TRUE is -1, that assumption does not hold (for signed variables).
>
> And both GCN and m68k '#define STORE_FLAG_VALUE -1', as Andreas noted.
>
> Tobias
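To make the FALSE < TRUE point concrete, here is a small C sketch (not GCC code; the enum and function names are invented for this example). It compares the raw flag values as signed and as unsigned integers, for a target whose "true" is 1 and for one whose "true" is -1.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for a target's comparison results.  m68k and
   GCN define STORE_FLAG_VALUE as -1; many other targets use 1.  */
enum { FLAG_FALSE = 0, FLAG_TRUE_PLUS1 = 1, FLAG_TRUE_MINUS1 = -1 };

/* Signed "less than" on the raw flag values.  */
bool
flag_lt_signed (int a, int b)
{
  return a < b;
}

/* Unsigned "less than" on the same bit patterns.  */
bool
flag_lt_unsigned (int a, int b)
{
  return (unsigned int) a < (unsigned int) b;
}

/* Self-check: FALSE < TRUE holds for signed values only when TRUE is 1.  */
void
check_flag_ordering (void)
{
  assert (flag_lt_signed (FLAG_FALSE, FLAG_TRUE_PLUS1));
  assert (!flag_lt_signed (FLAG_FALSE, FLAG_TRUE_MINUS1));
  assert (flag_lt_unsigned (FLAG_FALSE, FLAG_TRUE_MINUS1));
}
```

So any self-test that folds a signed comparison of the two flag values has to take STORE_FLAG_VALUE into account rather than assume an ordering.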
[PATCH] OpenMP: Integrate dynamic selectors with dispatch argument handling [PR118457]
Support for dynamic selectors in "declare variant" was developed in parallel with support for the adjust_args/append_args clauses and the dispatch construct; they collided in a bad way. This patch fixes the "sorry" for calls that need both by removing the adjust_args/append_args code from gimplify_call_expr and invoking it from the new variant substitution code instead. It's handled as a tree -> tree transformation rather than tree -> gimple because eventually this code may end up being invoked from the front ends instead of the gimplifier (see PR115076). gcc/ChangeLog PR middle-end/118457 * gimplify.cc (modify_call_for_omp_dispatch): New, containing code split from gimplify_call_expr and modified to emit tree instead of gimple. Remove the error for falling through to a call to the base function. (expand_variant_call_expr): New, split from gimplify_variant_call_expr. Call modify_call_for_omp_dispatch on calls to variants in a dispatch construct context. (gimplify_variant_call_expr): Make it call expand_variant_call_expr to do the actual work. (gimplify_call_expr): Remove sorry for calls involving both dynamic/late selectors and adjust_args/append_args, and adjust for new interface. Move adjust_args/append_args code to modify_call_for_omp_dispatch. (gimplify_omp_dispatch): Add some comments. gcc/testsuite/ChangeLog PR middle-end/118457 * c-c++-common/gomp/adjust-args-6.c: Remove xfails and adjust expected output. * c-c++-common/gomp/append-args-5.c: Adjust expected output. * c-c++-common/gomp/append-args-dynamic.c: New. * c-c++-common/gomp/dispatch-11.c: Adjust expected output. * gfortran.dg/gomp/dispatch-11.f90: Likewise. 
--- gcc/gimplify.cc | 815 +- .../c-c++-common/gomp/adjust-args-6.c | 13 +- .../c-c++-common/gomp/append-args-5.c | 19 +- .../c-c++-common/gomp/append-args-dynamic.c | 80 ++ gcc/testsuite/c-c++-common/gomp/dispatch-11.c | 22 +- .../gfortran.dg/gomp/dispatch-11.f90 | 5 - 6 files changed, 487 insertions(+), 467 deletions(-) create mode 100644 gcc/testsuite/c-c++-common/gomp/append-args-dynamic.c diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc index 160e7fc9df6..5852c618b05 100644 --- a/gcc/gimplify.cc +++ b/gcc/gimplify.cc @@ -3872,29 +3872,331 @@ find_supercontext (void) return NULL_TREE; } +/* OpenMP: Handle the append_args and adjust_args clauses of + declare_variant for EXPR, which is a CALL_EXPR whose CALL_EXPR_FN + is the variant, within a dispatch construct with clauses DISPATCH_CLAUSES + and location DISPATCH_LOC. + + 'append_args' causes interop objects are added after the last regular + (nonhidden, nonvariadic) arguments of the variant function. + 'adjust_args' with need_device_{addr,ptr} converts the pointer target of + a pointer from a host to a device address. This uses either the default + device or the passed device number, which then sets the default device + address. */ +static tree +modify_call_for_omp_dispatch (tree expr, tree dispatch_clauses, + location_t dispatch_loc) +{ + tree fndecl = get_callee_fndecl (expr); + + /* Skip processing if we don't get the expected call form. 
*/ + if (!fndecl) +return expr; + + int nargs = call_expr_nargs (expr); + tree dispatch_device_num = NULL_TREE; + tree dispatch_device_num_init = NULL_TREE; + tree dispatch_interop = NULL_TREE; + tree dispatch_append_args = NULL_TREE; + int nfirst_args = 0; + tree dispatch_adjust_args_list += lookup_attribute ("omp declare variant variant args", + DECL_ATTRIBUTES (fndecl)); + + if (dispatch_adjust_args_list) +{ + dispatch_adjust_args_list = TREE_VALUE (dispatch_adjust_args_list); + dispatch_append_args = TREE_CHAIN (dispatch_adjust_args_list); + if (TREE_PURPOSE (dispatch_adjust_args_list) == NULL_TREE + && TREE_VALUE (dispatch_adjust_args_list) == NULL_TREE) + dispatch_adjust_args_list = NULL_TREE; +} + if (dispatch_append_args) +{ + nfirst_args = tree_to_shwi (TREE_PURPOSE (dispatch_append_args)); + dispatch_append_args = TREE_VALUE (dispatch_append_args); +} + dispatch_device_num = omp_find_clause (dispatch_clauses, OMP_CLAUSE_DEVICE); + if (dispatch_device_num) +dispatch_device_num = OMP_CLAUSE_DEVICE_ID (dispatch_device_num); + dispatch_interop = omp_find_clause (dispatch_clauses, OMP_CLAUSE_INTEROP); + int nappend = 0, ninterop = 0; + for (tree t = dispatch_append_args; t; t = TREE_CHAIN (t)) +nappend++; + + /* FIXME: error checking should be taken out of this function and + handled before any attempt at filtering or resolution happens. + Otherwise whether or not diagnostics appear is determined by + GCC internals, how good the front ends are at constant-fo
[PATCH 01/17] LoongArch: (NFC) Remove atomic_optab and use amop instead
They are the same.

gcc/ChangeLog:

	* config/loongarch/sync.md (atomic_optab): Remove.
	(atomic_<atomic_optab><mode>): Change atomic_optab to amop.
	(atomic_fetch_<atomic_optab><mode>): Likewise.
---
 gcc/config/loongarch/sync.md | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/gcc/config/loongarch/sync.md b/gcc/config/loongarch/sync.md
index fd8d732dd67..75b134cd853 100644
--- a/gcc/config/loongarch/sync.md
+++ b/gcc/config/loongarch/sync.md
@@ -35,8 +35,6 @@ (define_c_enum "unspec" [
 ])

 (define_code_iterator any_atomic [plus ior xor and])
-(define_code_attr atomic_optab
-  [(plus "add") (ior "or") (xor "xor") (and "and")])

 ;; This attribute gives the format suffix for atomic memory operations.
 (define_mode_attr amo [(QI "b") (HI "h") (SI "w") (DI "d")])
@@ -175,7 +173,7 @@ (define_insn "atomic_store"
   }
   [(set (attr "length") (const_int 12))])

-(define_insn "atomic_<atomic_optab><mode>"
+(define_insn "atomic_<amop><mode>"
   [(set (match_operand:GPR 0 "memory_operand" "+ZB")
 	(unspec_volatile:GPR
 	  [(any_atomic:GPR (match_dup 0)
@@ -197,7 +195,7 @@ (define_insn "atomic_add"
   "amadd%A2.\t$zero,%z1,%0"
   [(set (attr "length") (const_int 4))])

-(define_insn "atomic_fetch_<atomic_optab><mode>"
+(define_insn "atomic_fetch_<amop><mode>"
   [(set (match_operand:GPR 0 "register_operand" "=&r")
 	(match_operand:GPR 1 "memory_operand" "+ZB"))
    (set (match_dup 1)
--
2.48.1
[to-be-committed][RISC-V][PR target/118934] Fix ICE in RISC-V long branch support
I'm not sure if I goof'd this or if I merely upstreamed someone else's goof. Either way the long branch code isn't working correctly.

We were using 'n' as the output modifier to negate the condition. But 'n' has a special meaning elsewhere, so when presented with a condition rather than what was expected, boom, the compiler ICE'd.

Thankfully there are only a few places where we were using %n, which I turned into %r.

The BZ entry includes a good testcase; it just takes a long time to compile as it's trying to create the out-of-range scenario. I'm not including the testcase due to how long it takes, but I did test it locally to ensure it's working properly now.

I'm sure that with a little bit of work I could create a testcase that worked before and fails with the trunk (by taking advantage of the fuzziness in length computations). So I'm going to consider this a regression.

Will push to the trunk after pre-commit testing does its thing.

Jeff

	PR target/118934
gcc/
	* config/riscv/corev.md (cv_branch): Adjust output template.
	(branch): Likewise.
	* config/riscv/riscv.md (branch): Likewise.
	* config/riscv/riscv.cc (riscv_asm_output_opcode): Handle 'r' rather
	than 'n'.

diff --git a/gcc/config/riscv/corev.md b/gcc/config/riscv/corev.md
index e44fdc1129d..d1c3aaa973e 100644
--- a/gcc/config/riscv/corev.md
+++ b/gcc/config/riscv/corev.md
@@ -2627,7 +2627,7 @@ (define_insn "*cv_branch"
   "TARGET_XCVBI"
 {
   if (get_attr_length (insn) == 12)
-    return "cv.b%n1\t%2,%z3,1f; jump\t%l0,ra; 1:";
+    return "cv.b%r1\t%2,%z3,1f; jump\t%l0,ra; 1:";

   return "cv.b%C1imm\t%2,%3,%0";
 }
@@ -2645,7 +2645,7 @@ (define_insn "*branch"
   "TARGET_XCVBI"
 {
   if (get_attr_length (insn) == 12)
-    return "b%n1\t%2,%z3,1f; jump\t%l0,ra; 1:";
+    return "b%r1\t%2,%z3,1f; jump\t%l0,ra; 1:";

   return "b%C1\t%2,%z3,%l0";
 }
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 89aa25d5da9..38f3ae7cd84 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -6868,7 +6868,7 @@ riscv_asm_output_opcode (FILE *asm_out_file, const char *p)
 	  any outermost HIGH.
    'R'	Print the low-part relocation associated with OP.
    'C'	Print the integer branch condition for comparison OP.
-   'n'	Print the inverse of the integer branch condition for comparison OP.
+   'r'	Print the inverse of the integer branch condition for comparison OP.
    'A'	Print the atomic operation suffix for memory model OP.
    'I'	Print the LR suffix for memory model OP.
    'J'	Print the SC suffix for memory model OP.
@@ -7027,7 +7027,7 @@ riscv_print_operand (FILE *file, rtx op, int letter)
       fputs (GET_RTX_NAME (code), file);
       break;

-    case 'n':
+    case 'r':
       /* The RTL names match the instruction names.  */
       fputs (GET_RTX_NAME (reverse_condition (code)), file);
       break;
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index f7070766783..95951605fb4 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -3252,7 +3252,7 @@ (define_insn "*branch"
   "!TARGET_XCVBI"
 {
   if (get_attr_length (insn) == 12)
-    return "b%n1\t%2,%z3,1f; jump\t%l0,ra; 1:";
+    return "b%r1\t%2,%z3,1f; jump\t%l0,ra; 1:";

   return "b%C1\t%2,%z3,%l0";
 }
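As a rough illustration of what the inverted-condition modifier produces in the 12-byte case, here is a standalone C sketch. It is not the GCC implementation (which calls reverse_condition on the RTL code); inverse_cond and format_long_branch are hypothetical helpers that only mimic the shape of the output template shown in the diff.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: map an integer branch condition name to its
   inverse, or return NULL if the name is not recognised.  */
const char *
inverse_cond (const char *cond)
{
  static const char *const pairs[][2] = {
    { "eq", "ne" }, { "lt", "ge" }, { "ltu", "geu" },
    { "gt", "le" }, { "gtu", "leu" },
  };

  for (size_t i = 0; i < sizeof pairs / sizeof pairs[0]; i++)
    {
      if (strcmp (cond, pairs[i][0]) == 0)
	return pairs[i][1];
      if (strcmp (cond, pairs[i][1]) == 0)
	return pairs[i][0];
    }
  return NULL;
}

/* Hypothetical helper: format the long-branch sequence.  The inverted
   branch skips over an unconditional jump to the far TARGET.  */
int
format_long_branch (char *buf, size_t size, const char *cond,
		    const char *rs1, const char *rs2, const char *target)
{
  const char *inv = inverse_cond (cond);
  assert (inv != NULL);
  return snprintf (buf, size, "b%s\t%s,%s,1f; jump\t%s,ra; 1:",
		   inv, rs1, rs2, target);
}
```

For a far "beq a0,a1,.L5" this yields "bne a0,a1,1f; jump .L5,ra; 1:", i.e. the short branch only has to reach the local label.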
New Swedish PO file for 'gcc' (version 15-b20250216)
Hello, gentle maintainer. This is a message from the Translation Project robot. A revised PO file for textual domain 'gcc' has been submitted by the Swedish team of translators. The file is available at: https://translationproject.org/latest/gcc/sv.po (This file, 'gcc-15-b20250216.sv.po', has just now been sent to you in a separate email.) All other PO files for your package are available in: https://translationproject.org/latest/gcc/ Please consider including all of these in your next release, whether official or a pretest. Whenever you have a new distribution with a new version number ready, containing a newer POT file, please send the URL of that distribution tarball to the address below. The tarball may be just a pretest or a snapshot, it does not even have to compile. It is just used by the translators when they need some extra translation context. The following HTML page has been updated: https://translationproject.org/domain/gcc.html If any question arises, please contact the translation coordinator. Thank you for all your work, The Translation Project robot, in the name of your translation coordinator.
[patch,avr] texi: Add new subsubsection "AVR Optimization Options"
This patch adds a new section "AVR Optimization Options" in the texi documentation. Ok for trunk? Johann -- AVR: Add texi @subsubsection "AVR Optimization Options". gcc/ * doc/invoke.texi (AVR Optimization Options): New @subsubsection for pure optimization options. diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 0c7adc039b5..eaf1727f88c 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -24349,33 +24349,6 @@ instructions. This option has only an effect on reduced Tiny devices like ATtiny40. See also the @code{absdata} @ref{AVR Variable Attributes,variable attribute}. -@opindex maccumulate-args -@item -maccumulate-args -Accumulate outgoing function arguments and acquire/release the needed -stack space for outgoing function arguments once in function -prologue/epilogue. Without this option, outgoing arguments are pushed -before calling a function and popped afterwards. - -Popping the arguments after the function call can be expensive on -AVR so that accumulating the stack space might lead to smaller -executables because arguments need not be removed from the -stack after such a function call. - -This option can lead to reduced code size for functions that perform -several calls to functions that get their arguments on the stack like -calls to printf-like functions. - -@opindex mbranch-cost -@item -mbranch-cost=@var{cost} -Set the branch costs for conditional branch instructions to -@var{cost}. Reasonable values for @var{cost} are small, non-negative -integers. The default branch cost is 0. - -@opindex mcall-prologues -@item -mcall-prologues -Functions prologues/epilogues are expanded as calls to appropriate -subroutines. Code size is smaller. - @opindex mcvt @item -mcvt Use a @emph{compact vector table}. Some devices support a CVT @@ -24393,27 +24366,6 @@ For example, you can link with @code{-Wl,--defsym,__init_cvt=0}. The CVT startup code is available since @w{@uref{https://github.com/avrdudes/avr-libc/issues/1010,AVR-LibC v2.3}}. 
-@opindex mfuse-add -@item -mfuse-add -@itemx -mno-fuse-add -@itemx -mfuse-add=@var{level} -Optimize indirect memory accesses on reduced Tiny devices. -The default uses @code{@var{level}=1} for optimizations @option{-Og} -and @option{-O1}, and @code{@var{level}=2} for higher optimizations. -Valid values for @var{level} are @code{0}, @code{1} and @code{2}. - -@opindex mfuse-move -@item -mfuse-move -@itemx -mno-fuse-move -@itemx -mfuse-move=@var{level} -Run a post reload optimization pass that tries to fuse move instructions -and to split multi-byte instructions into 8-bit operations. -The default uses @code{@var{level}=3} for optimization @option{-O1}, -and @code{@var{level}=23} for higher optimizations. -Valid values for @var{level} are in the range @code{0} @dots{} @code{23} -which is a 3:2:2:2 mixed radix value. Each digit controls some -aspect of the optimization. - @opindex mdouble @opindex mlong-double @item -mdouble=@var{bits} @@ -24502,39 +24454,6 @@ support (@w{@uref{https://sourceware.org/PR31124,PR31124}}) is available. In that case, @option{-mrodata-in-ram} can be used to return to the old layout with @code{.rodata} in RAM. -@opindex mstrict-X -@item -mstrict-X -Use address register @code{X} in a way proposed by the hardware. This means -that @code{X} is only used in indirect, post-increment or -pre-decrement addressing. - -Without this option, the @code{X} register may be used in the same way -as @code{Y} or @code{Z} which then is emulated by additional -instructions. -For example, loading a value with @code{X+const} addressing with a -small non-negative @code{const < 64} to a register @var{Rn} is -performed as - -@example -adiw r26, const ; X += const -ld @var{Rn}, X; @var{Rn} = *X -sbiw r26, const ; X -= const -@end example - -@opindex msplit-bit-shift -@item -msplit-bit-shift -Split multi-byte shifts with a constant offset into a shift with -a byte offset and a residual shift with a non-byte offset. 
-This optimization is turned on per default for @option{-O2} and higher, -including @option{-Os} but excluding @option{-Oz}. -Splitting of shifts with a constant offset that is -a multiple of 8 is controlled by @option{-mfuse-move}. - -@opindex msplit-ldst -@item -msplit-ldst -Split multi-byte loads and stores into several byte loads and stores. -This optimization is turned on per default for @option{-O2} and higher. - @opindex mtiny-stack @item -mtiny-stack Only change the lower 8@tie{}bits of the stack pointer. @@ -24586,6 +24505,98 @@ Warn if the ISR is misspelled, i.e.@: without __vector prefix. Enabled by default. @end table + +@subsubsection AVR Optimization Options +The following options are pure optimization options. +Options @option{-mgas-isr-prologues}, @option{-mmain-is-OS_task}, +@option{-mno-call-main} and @option{-mrelax} from above are only +@emph{almost} optimization options, since there are rare occasions +where their different code generation matters. + +@table
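An aside on the -mfuse-move wording in the moved text: a 3:2:2:2 mixed radix value has 3*2*2*2 = 24 states, which matches the documented range 0 ... 23. The following C sketch decodes such a value into its four digits. It is only an illustration: the digit order (radix 3 as the most significant digit) is an assumption, and decode_level is a made-up name, not something from the AVR backend.

```c
#include <assert.h>

/* Radices from most to least significant digit.  ASSUMPTION: the
   documentation does not say which digit carries radix 3; this sketch
   treats it as the most significant one.  */
static const int radix[4] = { 3, 2, 2, 2 };

/* Hypothetical helper: split LEVEL (0 ... 23) into four digits, with
   digits[0] the most significant.  */
void
decode_level (int level, int digits[4])
{
  assert (level >= 0 && level <= 23);
  for (int i = 3; i >= 0; i--)
    {
      digits[i] = level % radix[i];
      level /= radix[i];
    }
}
```

Under that assumption the -O1 default of 3 decodes to digits 0,0,1,1 and the default of 23 for higher optimization levels decodes to 2,1,1,1, i.e. every digit at its maximum.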
[PATCH] LoongArch: Fix incorrect reorder of __lsx_vldx and __lasx_xvldx [PR119084]
They could be incorrectly reordered with store instructions like st.b because the RTL expression does not have a memory_operand or a (mem) expression. The incorrect reorder has been observed in openh264 LTO build. Expand them to a (mem) expression instead of unspec to fix the issue. Then we need to make loongarch_address_insns return 1 for ADDRESS_REG_REG because the constraint "R" expects this behavior, or the vldx instruction will be considered invalid by the register allocate pass and turned to add.d + vld. Apply the ADDRESS_REG_REG penalty in loongarch_address_cost instead, loongarch_rtx_costs should also call loongarch_address_cost instead of loongarch_address_insns then. Closes: https://github.com/cisco/openh264/issues/3857 gcc/ChangeLog: PR target/119084 * config/loongarch/lasx.md (UNSPEC_LASX_XVLDX): Remove. (lasx_xvldx): Remove. * config/loongarch/lsx.md (UNSPEC_LSX_VLDX): Remove. (lsx_vldx): Remove. * config/loongarch/simd.md (QIVEC): New define_mode_iterator. (_vldx): New define_expand. * config/loongarch/loongarch.cc (loongarch_address_insns_1): New static function with most logic factored out from ... (loongarch_address_insns): ... here. Call loongarch_address_insns_1 with reg_reg_cost = 1. (loongarch_address_cost): Call loongarch_address_insns_1 with reg_reg_cost = la_addr_reg_reg_cost. gcc/testsuite/ChangeLog: PR target/119084 * gcc.target/loongarch/pr119084.c: New test. --- Bootstrapped and regtested on loongarch64-linux-gnu. Ok for trunk and gcc-14 branch? 
gcc/config/loongarch/lasx.md | 13 - gcc/config/loongarch/loongarch.cc | 48 +++ gcc/config/loongarch/lsx.md | 13 - gcc/config/loongarch/simd.md | 9 gcc/testsuite/gcc.target/loongarch/pr119084.c | 24 ++ 5 files changed, 61 insertions(+), 46 deletions(-) create mode 100644 gcc/testsuite/gcc.target/loongarch/pr119084.c diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md index e4505c1660d..43e3ab0026a 100644 --- a/gcc/config/loongarch/lasx.md +++ b/gcc/config/loongarch/lasx.md @@ -119,7 +119,6 @@ (define_c_enum "unspec" [ UNSPEC_LASX_XVSSRLRN UNSPEC_LASX_XVEXTL_QU_DU UNSPEC_LASX_XVLDI - UNSPEC_LASX_XVLDX UNSPEC_LASX_XVSTX UNSPEC_LASX_VECINIT_MERGE UNSPEC_LASX_VEC_SET_INTERNAL @@ -3579,18 +3578,6 @@ (define_insn "lasx_xvldi" [(set_attr "type" "simd_load") (set_attr "mode" "V4DI")]) -(define_insn "lasx_xvldx" - [(set (match_operand:V32QI 0 "register_operand" "=f") - (unspec:V32QI [(match_operand:DI 1 "register_operand" "r") - (match_operand:DI 2 "reg_or_0_operand" "rJ")] - UNSPEC_LASX_XVLDX))] - "ISA_HAS_LASX" -{ - return "xvldx\t%u0,%1,%z2"; -} - [(set_attr "type" "simd_load") - (set_attr "mode" "V32QI")]) - (define_insn "lasx_xvstx" [(set (mem:V32QI (plus:DI (match_operand:DI 1 "register_operand" "r") (match_operand:DI 2 "reg_or_0_operand" "rJ"))) diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc index eb3baac7019..3779e283f8d 100644 --- a/gcc/config/loongarch/loongarch.cc +++ b/gcc/config/loongarch/loongarch.cc @@ -2363,14 +2363,9 @@ loongarch_index_address_p (rtx addr, machine_mode mode ATTRIBUTE_UNUSED) return true; } -/* Return the number of instructions needed to load or store a value - of mode MODE at address X. Return 0 if X isn't valid for MODE. - Assume that multiword moves may need to be split into word moves - if MIGHT_SPLIT_P, otherwise assume that a single load or store is - enough. 
*/ - -int -loongarch_address_insns (rtx x, machine_mode mode, bool might_split_p) +static int +loongarch_address_insns_1 (rtx x, machine_mode mode, bool might_split_p, + int reg_reg_cost) { struct loongarch_address_info addr; int factor; @@ -2405,7 +2400,7 @@ loongarch_address_insns (rtx x, machine_mode mode, bool might_split_p) return factor; case ADDRESS_REG_REG: - return factor * la_addr_reg_reg_cost; + return factor * reg_reg_cost; case ADDRESS_CONST_INT: return lsx_p ? 0 : factor; @@ -2420,6 +2415,18 @@ loongarch_address_insns (rtx x, machine_mode mode, bool might_split_p) return 0; } +/* Return the number of instructions needed to load or store a value + of mode MODE at address X. Return 0 if X isn't valid for MODE. + Assume that multiword moves may need to be split into word moves + if MIGHT_SPLIT_P, otherwise assume that a single load or store is + enough. */ + +int +loongarch_address_insns (rtx x, machine_mode mode, bool might_split_p) +{ + return loongarch_address_insns_1 (x, mode, might_split_p, 1); +} + /* Return true if X fits within an unsigned field of BITS bits that is shifted left SHIFT bits before being used. */
[to-be-committed][RISC-V][PR target/116256] Fix minor code quality regression in reassociated arithmetic
The patch for target/116256 significantly simplified the condition and, I guess not too surprisingly, exposed a minor code quality regression.

Specifically, the split part of the define_insn_and_split only splits after reload (because we use a match_scratch). So there's nothing to combine the load-immediate with the subsequent add into an addi when the immediate fits into a simm12 field.

This patch adjusts the split code to handle that scenario directly and generate the more efficient code. We can squeeze out the slli in this test with a bit more work, but that's out of scope right now since that isn't a regression.

Tested in my tester. Waiting on pre-commit testing to render a verdict.

Jeff

	PR target/116256
gcc/
	* config/riscv/riscv.md (reassociating constant addition): Adjust
	split code to generate addi directly when possible.

gcc/testsuite
	* gcc.target/riscv/pr116256-1.c: New test.

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 95951605fb4..84bce409bc7 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -4684,10 +4684,22 @@ (define_insn_and_split ""
   "(TARGET_64BIT && riscv_const_insns (operands[3], false) == 1)"
   "#"
   "&& reload_completed"
-  [(set (match_dup 0) (ashift:DI (match_dup 1) (match_dup 2)))
-   (set (match_dup 4) (match_dup 3))
-   (set (match_dup 0) (plus:DI (match_dup 0) (match_dup 4)))]
-  ""
+  [(const_int 0)]
+  "{
+     rtx x = gen_rtx_ASHIFT (DImode, operands[1], operands[2]);
+     emit_insn (gen_rtx_SET (operands[0], x));
+
+     /* If the constant fits in a simm12, use it directly as we do not
+	get another good chance to optimize things again.  */
+     if (!SMALL_OPERAND (INTVAL (operands[3])))
+       emit_move_insn (operands[4], operands[3]);
+     else
+       operands[4] = operands[3];
+
+     x = gen_rtx_PLUS (DImode, operands[0], operands[4]);
+     emit_insn (gen_rtx_SET (operands[0], x));
+     DONE;
+   }"
   [(set_attr "type" "arith")])

 (define_insn_and_split ""
@@ -4700,13 +4712,26 @@ (define_insn_and_split ""
   "(TARGET_64BIT && riscv_const_insns (operands[3], false) == 1)"
   "#"
   "&& reload_completed"
-  [(set (match_dup 0) (ashift:DI (match_dup 1) (match_dup 2)))
-   (set (match_dup 4) (match_dup 3))
-   (set (match_dup 0) (sign_extend:DI (plus:SI (match_dup 5) (match_dup 6))))]
+  [(const_int 0)]
   "{
      operands[1] = gen_lowpart (DImode, operands[1]);
      operands[5] = gen_lowpart (SImode, operands[0]);
      operands[6] = gen_lowpart (SImode, operands[4]);
+
+     rtx x = gen_rtx_ASHIFT (DImode, operands[1], operands[2]);
+     emit_insn (gen_rtx_SET (operands[0], x));
+
+     /* If the constant fits in a simm12, use it directly as we do not
+	get another good chance to optimize things again.  */
+     if (!SMALL_OPERAND (INTVAL (operands[3])))
+       emit_move_insn (operands[4], operands[3]);
+     else
+       operands[6] = operands[3];
+
+     x = gen_rtx_PLUS (SImode, operands[5], operands[6]);
+     x = gen_rtx_SIGN_EXTEND (DImode, x);
+     emit_insn (gen_rtx_SET (operands[0], x));
+     DONE;
   }"
   [(set_attr "type" "arith")])

diff --git a/gcc/testsuite/gcc.target/riscv/pr116256-1.c b/gcc/testsuite/gcc.target/riscv/pr116256-1.c
new file mode 100644
index 000..9543716cd68
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr116256-1.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcb -mabi=lp64d" { target { rv64 } } } */
+
+
+bool f1(long a)
+{
+    long b = a << 4;
+    return b == -128;
+}
+
+/* We want to verify that we have generated addi
+   rather than li+add.  */
+/* { dg-final { scan-assembler-not "add\t" } } */
+/* { dg-final { scan-assembler "addi\t" } } */
+
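For readers less familiar with the simm12 check, here is a minimal standalone C sketch of the decision the new split code makes. It is not the GCC macro: fits_simm12 and insns_to_add_constant are made-up names, and the instruction count assumes (as the insn condition in the patch requires) that the constant can otherwise be loaded with a single instruction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if VALUE fits the signed 12-bit immediate
   field of an RISC-V I-type instruction such as addi.  */
bool
fits_simm12 (int64_t value)
{
  return value >= -2048 && value <= 2047;
}

/* Hypothetical helper: instructions needed to add VALUE to a register,
   assuming a non-simm12 constant takes exactly one instruction to load.
   One addi if the constant fits, otherwise load-immediate plus add.  */
int
insns_to_add_constant (int64_t value)
{
  return fits_simm12 (value) ? 1 : 2;
}

/* Self-check of the boundary values.  */
void
check_simm12 (void)
{
  assert (fits_simm12 (-2048) && fits_simm12 (2047));
  assert (!fits_simm12 (-2049) && !fits_simm12 (2048));
}
```

A small constant such as the one in the new test falls inside the range, so the split can emit the shift followed by a single addi.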