[PATCH] i386: Add br_mispredict_scale in cost table.

2025-01-06 Thread Hongyu Wang
Hi, For later processors, the pipeline went deeper so the penalty for untaken branch can be larger than before. Add a new parameter br_mispredict_scale to describe the penalty, and adopt to noce_max_ifcvt_seq_cost hook to allow longer sequence to be converted with cmove. This improves cpu2017 544
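
As a rough illustration of the kind of code this cost tuning affects (not taken from the patch), the sketch below shows a short conditional assignment next to a hand-written branch-free equivalent. The function names are hypothetical, and whether GCC actually emits a conditional move for the branchy form depends on the target and its cost tables.

```c
#include <assert.h>

/* Branchy form: a conditional assignment that an if-conversion pass may
   turn into a conditional move when its cost model allows it. */
static int select_branch(int cond, int a, int b)
{
    int r = b;
    if (cond)
        r = a;
    return r;
}

/* Hand-written branch-free form (illustrative only): build an all-ones or
   all-zeros mask from the condition and blend the two values. Assumes a
   two's-complement target, as on all GCC ports. */
static int select_mask(int cond, int a, int b)
{
    unsigned mask = 0u - (unsigned)(cond != 0);
    return (int)(((unsigned)a & mask) | ((unsigned)b & ~mask));
}

/* Both forms must agree for either value of the condition. */
static void check_select(void)
{
    assert(select_branch(1, 3, 9) == select_mask(1, 3, 9));
    assert(select_branch(0, 3, 9) == select_mask(0, 3, 9));
}
```

A higher misprediction penalty makes the branch-free shape worthwhile for longer sequences, which is the trade-off the message describes.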

Re: [PATCH v2 6/7] Alpha: Add option to avoid data races for sub-longword memory stores [PR117759]

2025-01-06 Thread Linus Torvalds
On Mon, 6 Jan 2025 at 16:59, Linus Torvalds wrote: > > There is absolutely no gray area here. It was always buggy, and the > alpha architecture was always completely and fundamentally > mis-designed. Note that I really do want to re-emphasize that while I think it's kind of interesting that Macie

[PATCH v1] LoongArch: Optimize the cost of vec_construct.

2025-01-06 Thread chenxiaolong
When analyzing 525 on LoongArch architecture, it was found that the for loop of hotspot function x264_pixel_satd_8x4 could not be quantized 256-bit due to the cost of vec_construct setting. After re-adjusting vec_construct, the performance of 525 program was improved by 16.57%. It was found that

Re: [PATCH v2 6/7] Alpha: Add option to avoid data races for sub-longword memory stores [PR117759]

2025-01-06 Thread Jeff Law
On 1/6/25 6:03 AM, Maciej W. Rozycki wrote: With non-BWX Alpha implementations we have a problem of data races where a 8-bit byte or 16-bit word quantity is to be written to memory in that in those cases we use an unprotected RMW access of a 32-bit longword or 64-bit quadword width. If conten

RE: [PATCH v1 1/4] Match: Refactor the signed SAT_SUB match patterns [NFC]

2025-01-06 Thread Li, Pan2
Kindly ping for the series. Pan -----Original Message----- From: Li, Pan2 Sent: Monday, December 23, 2024 3:09 PM To: gcc-patches@gcc.gnu.org Cc: richard.guent...@gmail.com; tamar.christ...@arm.com; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; jeffreya...@gmail.com; rdapp@gmail.com Subject:

[PATCH] LoongArch: Adjust the cost of ADDRESS_REG_REG [PR114978].

2025-01-06 Thread Lulu Cheng
After changing this cost from 1 to 3, the performance of spec2006 401 473 416 465 482 can be improved by about 2% on LA664. Add option '-maddr-reg-reg-cost='. gcc/ChangeLog: * config/loongarch/genopts/loongarch.opt.in: Add option '-maddr-reg-reg-cost='. * config/loongarch

Re: [PATCH v2 6/7] Alpha: Add option to avoid data races for sub-longword memory stores [PR117759]

2025-01-06 Thread Jeff Law
On 1/6/25 5:59 PM, Linus Torvalds wrote: On Mon, 6 Jan 2025 at 16:13, Jeff Law wrote: But in the case of concurrent accesses, shouldn't these objects be declared as atomic? No. They aren't concurrent accesses to the same variable. They are concurrent accesses to *different* memory locat

Re: [PATCH 1/2] testsuite: add testcase for fixed PR117546

2025-01-06 Thread Sam James
Jeff Law writes: > On 1/3/25 11:11 AM, Sam James wrote: >> Sam James writes: >> >>> PR117546 was fixed by Eric's r14-10693-gadab597af288d6 change, but >>> the testcase here is sufficiently different to be worth including >>> in torture/. >>> >>> gcc/testsuite/ChangeLog: >>> PR ipa/117546 >>

Re: [PATCH v2 6/7] Alpha: Add option to avoid data races for sub-longword memory stores [PR117759]

2025-01-06 Thread Linus Torvalds
On Mon, 6 Jan 2025 at 16:13, Jeff Law wrote: > > But in the case of concurrent accesses, shouldn't these objects be > declared as atomic? No. They aren't concurrent accesses to the same variable. They are concurrent accesses to *different* memory locations, and the compiler is not allowed to me

Re: [PATCH v2 6/7] Alpha: Add option to avoid data races for sub-longword memory stores [PR117759]

2025-01-06 Thread Paul E. McKenney
On Mon, Jan 06, 2025 at 05:12:57PM -0700, Jeff Law wrote: > > > On 1/6/25 6:03 AM, Maciej W. Rozycki wrote: > > With non-BWX Alpha implementations we have a problem of data races where > > a 8-bit byte or 16-bit word quantity is to be written to memory in that > > in those cases we use an unprote

Re: [PATCH 1/2] testsuite: add testcase for fixed PR117546

2025-01-06 Thread Jeff Law
On 1/3/25 11:11 AM, Sam James wrote: Sam James writes: PR117546 was fixed by Eric's r14-10693-gadab597af288d6 change, but the testcase here is sufficiently different to be worth including in torture/. gcc/testsuite/ChangeLog: PR ipa/117546 * gcc.dg/torture/pr117546.c: New t

Re: [PATCH v2 6/7] Alpha: Add option to avoid data races for sub-longword memory stores [PR117759]

2025-01-06 Thread Jeff Law
On 1/6/25 6:03 AM, Maciej W. Rozycki wrote: With non-BWX Alpha implementations we have a problem of data races where a 8-bit byte or 16-bit word quantity is to be written to memory in that in those cases we use an unprotected RMW access of a 32-bit longword or 64-bit quadword width. If conten

Re: [PATCH] testsuite: generalized field-merge tests for <32-bit int [PR118025]

2025-01-06 Thread Mike Stump
On Jan 6, 2025, at 3:05 PM, Alexandre Oliva wrote: > > On Dec 22, 2024, Alexandre Oliva wrote: > >> for gcc/testsuite/ChangeLog > >> PR testsuite/118025 >> * gcc.dg/field-merge-1.c: Convert constants to desired types. >> * gcc.dg/field-merge-3.c: Likewise. >> * gcc.dg/fiel

[PATCH] c-pretty-print.cc (pp_c_tree_decl_identifier): Strip private name encoding, PR118303

2025-01-06 Thread Hans-Peter Nilsson
Regtested native x86_64-linux. Also tested mmix-knuth-mmixware, where it fixes ONE testcase, but one which is a regression on master. The PR component is currently ipa, changed from the original middle-end. IIUC this bug-fix doesn't fit the ipa category IMHO, but rather more general tree-opt

Re: [PATCH] testsuite: generalized field-merge tests for <32-bit int [PR118025]

2025-01-06 Thread Alexandre Oliva
On Dec 22, 2024, Alexandre Oliva wrote: > for gcc/testsuite/ChangeLog > PR testsuite/118025 > * gcc.dg/field-merge-1.c: Convert constants to desired types. > * gcc.dg/field-merge-3.c: Likewise. > * gcc.dg/field-merge-4.c: Likewise. > * gcc.dg/field-merge-5.c: Likew

Re: [PATCH] ifcombine field-merge: improve handling of dwords

2025-01-06 Thread Alexandre Oliva
On Dec 21, 2024, Alexandre Oliva wrote: > On Dec 20, 2024, Jakub Jelinek wrote: >> On Wed, Dec 18, 2024 at 12:59:11AM -0300, Alexandre Oliva wrote: >>> * gcc.dg/field-merge-16.c: New. >> Note the test FAILs on i686-linux or on x86_64-linux with -m32. > Indeed, thanks. Here's a fix. Ping? htt

Re: [PATCH] c: do not warn about truncating NUL char when initializing nonstring arrays [PR117178]

2025-01-06 Thread Marek Polacek
On Sun, Dec 15, 2024 at 08:02:57PM -0800, Kees Cook wrote: > When initializing a nonstring char array when compiled with > -Wunterminated-string-initialization the warning trips even when > truncating the trailing NUL character from the string constant. Only > warn about this when running under -Wc
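
For readers unfamiliar with the pattern under discussion, here is a minimal sketch of a nonstring array whose initializer exactly fills it, so the trailing NUL is dropped. The names `magic` and `has_magic` are hypothetical, the attribute spelling is GCC's, and which diagnostics fire depends on the compiler version and options.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Four bytes with no room for the trailing NUL. The nonstring attribute
   documents that this array is not meant to be used as a C string. */
static const char magic[4] __attribute__((nonstring)) = "GCC!";

/* Compare by explicit length; never pass `magic` to str* functions. */
static int has_magic(const unsigned char *buf, size_t len)
{
    return len >= sizeof magic && memcmp(buf, magic, sizeof magic) == 0;
}

/* Small self-check of the helper above. */
static void check_magic(void)
{
    assert(sizeof magic == 4);
    assert(has_magic((const unsigned char *)"GCC!tail", 8));
}
```

This is valid C only; C++ rejects an initializer string that is too long for its array.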

Re: [PATCH v3] aarch64: remove extra XTN in vector concatenation

2025-01-06 Thread Richard Sandiford
Akram Ahmad writes: > Hi Richard, > > Thanks for the feedback. I've copied in the resulting patch here- if > this is okay, please could it be committed on my behalf? The patch > continues below. > > Many thanks, > > Akram Thanks. LGTM. Pushed to trunk. Richard > --- > > GIMPLE code which perfo

Re: [PATCH v2 3/7] Alpha: Fix a block move pessimisation with zero-extension after LDWU

2025-01-06 Thread Jeff Law
On 1/6/25 6:03 AM, Maciej W. Rozycki wrote: For the BWX case we have a pessimisation in `alpha_expand_block_move' for HImode loads where we place the data loaded into a HImode register as well, therefore losing information that indeed the data loaded has already been zero-extended to the full

Re: [PATCH v2 2/7] Alpha: Optimize block moves coming from longword-aligned source

2025-01-06 Thread Jeff Law
On 1/6/25 6:03 AM, Maciej W. Rozycki wrote: Now that we have proper alignment determination for block moves in place the case of copying a block of longword-aligned data has become real, so implement the merging of loaded data from pairs of SImode registers into single DImode registers for the

Re: [PATCH v2 1/7] Alpha: Always respect -mbwx, -mcix, -mfix, -mmax, and their inverse

2025-01-06 Thread Jeff Law
On 1/6/25 6:03 AM, Maciej W. Rozycki wrote: Contrary to user documentation the `-mbwx', `-mcix', `-mfix', `-mmax' feature options and their inverse forms are ignored whenever `-mcpu=' option is in effect, either by having been given explicitly or where configured as the default such as with th

Re: [PATCH v2 4/7] Alpha: Export `emit_unlikely_jump' for a subsequent change to use

2025-01-06 Thread Jeff Law
On 1/6/25 6:03 AM, Maciej W. Rozycki wrote: Rename `emit_unlikely_jump' function to `alpha_emit_unlikely_jump', so as to avoid namespace pollution, updating callers accordingly and export it for use in the machine description. Make it return the insn emitted. gcc/ * config/al

Re: [PATCH] c: Restore warning for incomplete structures declared in parameter list [PR117866]

2025-01-06 Thread Joseph Myers
On Mon, 6 Jan 2025, Martin Uecker wrote: > > Happy new year! Please consider the following patch. > > Bootstrapped and regression tested on x86_64. > > > c: Restore warning for incomplete structures declared in parameter list > [PR117866] > > In C23 mode the warning about declari

Re: [PATCH] COBOL 1/8 hdr: header files

2025-01-06 Thread Joseph Myers
On Sat, 4 Jan 2025, James K. Lowden wrote: > On Fri, 3 Jan 2025 19:46:38 +0100 > Jakub Jelinek wrote: > > > Again, the question is if it needs to be supported everywhere, or > > just error out on targets which don't have _Float128 > > Our preference is simply to error out on targets that don't

libgo patch committed: fix Config.Time in tests with expired certificates

2025-01-06 Thread Ian Lance Taylor
PR 118286 points out that some libgo tests are starting to fail because they use test certificates that expired on January 1. This libgo patch is a backport of https://go.dev/cl/640237 in the main repo. It uses the existing config.Time field to avoid these test failures. Bootstrapped and ran Go t

RE: [RFC][PATCH] AArch64: Remove AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS

2025-01-06 Thread Tamar Christina
> -----Original Message----- > From: Richard Sandiford > Sent: Monday, January 6, 2025 5:54 PM > To: Jennifer Schmitz > Cc: Richard Biener ; Richard Biener > ; Tamar Christina ; > gcc-patches@gcc.gnu.org; Kyrylo Tkachov > Subject: Re: [RFC][PATCH] AArch64: Remove > AARCH64_EXTRA_TUNE_USE_NEW_VEC

Re: [RFC][PATCH] AArch64: Remove AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS

2025-01-06 Thread Richard Sandiford
Jennifer Schmitz writes: >> It would also be good to check for performance regressions, now that we have >> a patch to test: >> I will run SPEC2017 with -mcpu=generic and -mcpu=native on Grace, but we >> would appreciate help with benchmarking on other platforms. >> Tamar, would you still be wil

[PATCH] Accept commas between clauses in OpenMP declare variant

2025-01-06 Thread Paul-Antoine Arras
Add support to the Fortran parser for the new OpenMP syntax that allows a comma after the directive name and between clauses of declare variant. The C and C++ parsers already support this syntax so only a new test is added. gcc/fortran/ChangeLog: * openmp.cc (gfc_match_omp_declare_variant

Re: [PATCH v2 5/7] IRA+LRA: Let the backend request to split basic blocks

2025-01-06 Thread Richard Sandiford
"Maciej W. Rozycki" writes: > The next change for Alpha will produce extra labels and branches in > reload, which in turn requires basic blocks to be split at completion. > We do this already for functions that can trap, so just extend the > arrangement with a flag for the backend to use whenev

Re: [PATCH] or1k: add .note.GNU-stack section on linux

2025-01-06 Thread Stafford Horne
On Mon, Jan 06, 2025 at 10:03:50AM -0700, Jeff Law wrote: > > > On 1/6/25 10:02 AM, Stafford Horne wrote: > > On Mon, Jan 06, 2025 at 07:37:56AM -0700, Jeff Law wrote: > > > > > > > > > On 1/6/25 6:01 AM, Stafford Horne wrote: > > > > In the OpenRISC build we get the following warning: > > > >

[PATCH v3] aarch64: remove extra XTN in vector concatenation

2025-01-06 Thread Akram Ahmad
Hi Richard, Thanks for the feedback. I've copied in the resulting patch here- if this is okay, please could it be committed on my behalf? The patch continues below. Many thanks, Akram --- GIMPLE code which performs a narrowing truncation on the result of a vector concatenation currently result

Re: [Ping, Fortran, Patch, PR114612, v1] Fix missing deep-copy for allocatable components of derived types having cycles.

2025-01-06 Thread Jerry D
On 1/6/25 2:08 AM, Andre Vehreschild wrote: Hi all, attached patch has been rebased to latest trunk. Just pinging! Regtests ok on x86_64-pc-linux-gnu / F41. Ok for mainline? - Andre On Fri, 13 Dec 2024 12:10:58 +0100 Andre Vehreschild wrote: Hi all, attached patch fixes deep-copying (or r

Re: [-Ping-, Fortran, Patch, PR116669, v3] Fix ICE in deallocation of derived types having cyclic dependencies

2025-01-06 Thread Jerry D
On 1/6/25 6:21 AM, Andre Vehreschild wrote: Hi all, during looking for something completely different, I figured, that gcc does not use std::set internally, but its implementation of hash_set. I therefore adapted the patch to use it. Nothing more changed. Still regtests ok on x86_64-pc-linux-gn

Re: [Ping, Fortran, Patch, PR116669, v1] Fix ICE in deallocation of derived types having cyclic dependencies

2025-01-06 Thread Jerry D
On 1/6/25 2:06 AM, Andre Vehreschild wrote: Hi all, pinging attached rebased patch. Regtests ok on x86_64-pc-linux-gnu / F41. Ok for mainline? - Andre On Thu, 12 Dec 2024 14:50:13 +0100 Andre Vehreschild wrote: Hi all, attached patch improves analysis of cycles in derived types, i.e. type

Re: [PATCH] or1k: add .note.GNU-stack section on linux

2025-01-06 Thread Jeff Law
On 1/6/25 10:02 AM, Stafford Horne wrote: On Mon, Jan 06, 2025 at 07:37:56AM -0700, Jeff Law wrote: On 1/6/25 6:01 AM, Stafford Horne wrote: In the OpenRISC build we get the following warning: ld: warning: __modsi3_s.o: missing .note.GNU-stack section implies executable stack

Re: [PATCH] or1k: add .note.GNU-stack section on linux

2025-01-06 Thread Stafford Horne
On Mon, Jan 06, 2025 at 07:37:56AM -0700, Jeff Law wrote: > > > On 1/6/25 6:01 AM, Stafford Horne wrote: > > In the OpenRISC build we get the following warning: > > > > ld: warning: __modsi3_s.o: missing .note.GNU-stack section implies > > executable stack > > ld: NOTE: This behaviour

Re: [ping][PATCH] testsuite/118127: Pass fortran tests on ppc64le for IEEE128 long doubles

2025-01-06 Thread Jakub Jelinek
On Mon, Jan 06, 2025 at 11:01:18AM -0500, Siddhesh Poyarekar wrote: > Ping! > > On 2024-12-19 08:16, Siddhesh Poyarekar wrote: > > Denormal behaviour is well defined for IEEE128 long doubles, so don't > > XFAIL some gfortran tests on ppc64le when configured with the IEEE128 > > long double ABI. >

Re: [RFC/RFA] [PR tree-optimization/92539] Improve code and avoid Warray-bounds false positive

2025-01-06 Thread Qing Zhao
> On Jan 6, 2025, at 11:01, Richard Biener wrote: > > On Mon, Jan 6, 2025 at 3:43 PM Qing Zhao wrote: >> >> >> >>> On Jan 6, 2025, at 09:21, Jeff Law wrote: >>> >>> >>> >>> On 1/6/25 7:11 AM, Qing Zhao wrote: > > Given it doesn't cause user visible UB, we could insert the trap

Re: [RFC/RFA] [PR tree-optimization/92539] Improve code and avoid Warray-bounds false positive

2025-01-06 Thread Jeff Law
On 1/6/25 9:01 AM, Richard Biener wrote: Note unrolling doesn't introduce UB - it makes conditional UB "obvious". That's fair and it's how I often view these kinds of things when they pop out via jump threading. So unless the condition guarding the UB unrolling exposes is visibly false

Re: [PATCH] Only apply adjust_args in OpenMP dispatch if variant substitution occurs

2025-01-06 Thread Paul-Antoine Arras
Apologies, I forgot to add the testcase. Please find attached an updated patch. On 06/01/2025 17:12, Paul-Antoine Arras wrote: This is a followup to 084ea8ad584 OpenMP: middle-end support for dispatch + adjust_args. This patch fixes a bug that caused arguments in an OpenMP dispatch call to be

[PATCH] Only apply adjust_args in OpenMP dispatch if variant substitution occurs

2025-01-06 Thread Paul-Antoine Arras
This is a followup to 084ea8ad584 OpenMP: middle-end support for dispatch + adjust_args. This patch fixes a bug that caused arguments in an OpenMP dispatch call to be modified even when no variant substitution occurred. gcc/ChangeLog: * gimplify.cc (gimplify_call_expr): Create variable

Re: [RFC/RFA] [PR tree-optimization/92539] Improve code and avoid Warray-bounds false positive

2025-01-06 Thread Richard Biener
On Mon, Jan 6, 2025 at 3:43 PM Qing Zhao wrote: > > > > > On Jan 6, 2025, at 09:21, Jeff Law wrote: > > > > > > > > On 1/6/25 7:11 AM, Qing Zhao wrote: > >>> > >>> Given it doesn't cause user visible UB, we could insert the trap *before* > >>> the UB inducing statement. That would then make the

[ping][PATCH] testsuite/118127: Pass fortran tests on ppc64le for IEEE128 long doubles

2025-01-06 Thread Siddhesh Poyarekar
Ping! On 2024-12-19 08:16, Siddhesh Poyarekar wrote: Denormal behaviour is well defined for IEEE128 long doubles, so don't XFAIL some gfortran tests on ppc64le when configured with the IEEE128 long double ABI. gcc/testsuite/ChangeLog: PR testsuite/118127 * gfortran.dg/default_f

Re: [PATCH v2 2/3] cfgexpand: Rewrite add_scope_conflicts_2 to use cache and look back further [PR111422]

2025-01-06 Thread Richard Biener
On Tue, Dec 31, 2024 at 2:04 PM Tamar Christina wrote: > > > -----Original Message----- > > From: Richard Biener > > Sent: Wednesday, November 20, 2024 11:28 AM > > To: Andrew Pinski > > Cc: gcc-patches@gcc.gnu.org > > Subject: Re: [PATCH v2 2/3] cfgexpand: Rewrite add_scope_conflicts_2 to use >

Re: [PATCH] arm: [MVE intrinsics] Another fix for moves of tuples (PR target/118131)

2025-01-06 Thread Christophe Lyon
ping? On Fri, 20 Dec 2024 at 23:53, Christophe Lyon wrote: > > Commit r15-6389-g670df03e5294a3 only partially fixed support for moves > of large modes: despite the introduction of V2x* and V4x* modes in > r15-6245-g4f4e13dd235b to support MVE tuples, we still need to support > TI, OI and XI modes

[PATCH] Do not call cp_parser_omp_dispatch directly in cp_parser_pragma

2025-01-06 Thread Paul-Antoine Arras
This is a followup to ed49709acda OpenMP: C++ front-end support for dispatch + adjust_args. The call to cp_parser_omp_dispatch only belongs in cp_parser_omp_construct. In cp_parser_pragma, handle PRAGMA_OMP_DISPATCH by calling cp_parser_omp_construct. gcc/cp/ChangeLog: * parser.cc (cp_pa

Re: [PATCH v2 1/4] testsuite: RISC-V: Add effective target for E ABI variant

2025-01-06 Thread Jeff Law
On 1/4/25 11:01 AM, Dimitar Dimitrov wrote: Add new effective target check for either ILP32E or ILP64E ABI variants. Initial implementation only checks for RV32E or RV64E ISA, which in turn implies that ILP32E/ILP64E ABI is used. The RV32I+ILP32E and RV64I+ILP64E combinations are not yet cau

[PATCH] c: Restore warning for incomplete structures declared in parameter list [PR117866]

2025-01-06 Thread Martin Uecker
Happy new year! Please consider the following patch. Bootstrapped and regression tested on x86_64. c: Restore warning for incomplete structures declared in parameter list [PR117866] In C23 mode the warning about declaring structures and union in parameter lists was removed, b

Re: [RFC/RFA] [PR tree-optimization/92539] Improve code and avoid Warray-bounds false positive

2025-01-06 Thread Qing Zhao
> On Jan 6, 2025, at 09:21, Jeff Law wrote: > > > > On 1/6/25 7:11 AM, Qing Zhao wrote: >>> >>> Given it doesn't cause user visible UB, we could insert the trap *before* >>> the UB inducing statement. That would then make the statement unreachable >>> and it'd get removed avoiding the fal

Re: [PATCH 2/2] Alpha: Restore frame pointer last in `builtin_longjmp' [PR64242]

2025-01-06 Thread Jeff Law
On 1/5/25 9:40 AM, Maciej W. Rozycki wrote: Add similar arrangements to `builtin_longjmp' for Alpha as with commit 71b144289c1c ("re PR middle-end/64242 (Longjmp expansion incorrect)") and commit 511ed59d0b04 ("Fix PR64242 - Longjmp expansion incorrect"), so as to restore the frame pointer las

Re: [PATCH 1/2] Alpha: Add memory clobbers to `builtin_longjmp' expansion

2025-01-06 Thread Jeff Law
On 1/5/25 9:40 AM, Maciej W. Rozycki wrote: Add the same memory clobbers to `builtin_longjmp' for Alpha as with commit 41439bf6a647 ("builtins.c (expand_builtin_longjmp): Added two memory clobbers."), to prevent instructions that access memory via the frame or stack pointer from being moved ac

Re: [PATCH] or1k: add .note.GNU-stack section on linux

2025-01-06 Thread Jeff Law
On 1/6/25 6:01 AM, Stafford Horne wrote: In the OpenRISC build we get the following warning: ld: warning: __modsi3_s.o: missing .note.GNU-stack section implies executable stack ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker Fix this b

Re: [RFC/RFA] [PR tree-optimization/92539] Improve code and avoid Warray-bounds false positive

2025-01-06 Thread Jeff Law
On 1/6/25 7:11 AM, Qing Zhao wrote: Given it doesn't cause user visible UB, we could insert the trap *before* the UB inducing statement. That would then make the statement unreachable and it'd get removed avoiding the false positive diagnostic. Yes, that’s a good idea. However, in order

Re: [-Ping-, Fortran, Patch, PR116669, v3] Fix ICE in deallocation of derived types having cyclic dependencies

2025-01-06 Thread Andre Vehreschild
Hi all, during looking for something completely different, I figured, that gcc does not use std::set internally, but its implementation of hash_set. I therefore adapted the patch to use it. Nothing more changed. Still regtests ok on x86_64-pc-linux-gnu / F41. Ok for mainline? Regards, An

Re: [RFC][PATCH] AArch64: Remove AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS

2025-01-06 Thread Jennifer Schmitz
> On 19 Dec 2024, at 14:10, Jennifer Schmitz wrote: > > > >> On 19 Dec 2024, at 11:14, Richard Sandiford >> wrote: >> >> External email: Use caution opening links or attachments >> >> >> Jennifer Schmitz writes: >>> @@ -8834,22 +8834,7 @@ vectorizable_store (vec_info *vinfo, >>>

Re: [RFC/RFA] [PR tree-optimization/92539] Improve code and avoid Warray-bounds false positive

2025-01-06 Thread Qing Zhao
> On Jan 4, 2025, at 12:58, Jeff Law wrote: > > > > On 1/3/25 10:30 AM, Qing Zhao wrote: >>> On Jan 3, 2025, at 11:41, Richard Biener wrote: >>> >>> >>> Am 03.01.2025 um 16:22 schrieb Jeff Law : So this is an implementation of an idea I had a few years back and prot

[PATCH] rtl: Remove invalid compare simplification [PR117186]

2025-01-06 Thread Richard Sandiford
g:d882fe5150fbbeb4e44d007bb4964e5b22373021, posted at https://gcc.gnu.org/pipermail/gcc-patches/2000-July/033786.html , added code to treat: (set (reg:CC cc) (compare:CC (gt:M (reg:CC cc) 0) (lt:M (reg:CC cc) 0))) as a nop. This PR shows that that isn't always correct. The compare in the set a

Re: [PATCH] Alpha: Always respect -mbwx, -mcix, -mfix, -mmax, and their inverse

2025-01-06 Thread Maciej W. Rozycki
On Mon, 30 Dec 2024, Maciej W. Rozycki wrote: > > Contrary to user documentation the `-mbwx', `-mcix', `-mfix', `-mmax' > > feature options and their inverse forms are ignored whenever `-mcpu=' > > option is in effect, either by having been given explicitly or where > > configured as the defaul

[PATCH v2 5/7] IRA+LRA: Let the backend request to split basic blocks

2025-01-06 Thread Maciej W. Rozycki
The next change for Alpha will produce extra labels and branches in reload, which in turn requires basic blocks to be split at completion. We do this already for functions that can trap, so just extend the arrangement with a flag for the backend to use whenever it finds it necessary. g

[PATCH v2 4/7] Alpha: Export `emit_unlikely_jump' for a subsequent change to use

2025-01-06 Thread Maciej W. Rozycki
Rename `emit_unlikely_jump' function to `alpha_emit_unlikely_jump', so as to avoid namespace pollution, updating callers accordingly and export it for use in the machine description. Make it return the insn emitted. gcc/ * config/alpha/alpha-protos.h (alpha_emit_unlikely_jump):

[PATCH v2 3/7] Alpha: Fix a block move pessimisation with zero-extension after LDWU

2025-01-06 Thread Maciej W. Rozycki
For the BWX case we have a pessimisation in `alpha_expand_block_move' for HImode loads where we place the data loaded into a HImode register as well, therefore losing information that indeed the data loaded has already been zero-extended to the full DImode width of the register. Later on when

[PATCH v2 7/7] Alpha: Add option to avoid data races for partial writes [PR117759]

2025-01-06 Thread Maciej W. Rozycki
Similarly to data races with 8-bit byte or 16-bit word quantity memory writes on non-BWX Alpha implementations we have the same problem even on BWX implementations with partial memory writes produced for unaligned stores as well as block memory move and clear operations. This happens at the bo

[PATCH v2 6/7] Alpha: Add option to avoid data races for sub-longword memory stores [PR117759]

2025-01-06 Thread Maciej W. Rozycki
With non-BWX Alpha implementations we have a problem of data races where a 8-bit byte or 16-bit word quantity is to be written to memory in that in those cases we use an unprotected RMW access of a 32-bit longword or 64-bit quadword width. If contents of the longword or quadword accessed outsi
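
The race described here can be replayed deterministically in plain C. The sketch below is only a single-threaded simulation of one unlucky interleaving, not the code the patch generates; `insert_byte` and `lost_update_demo` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Merge an 8-bit value into byte lane `idx` (0..3) of a 32-bit longword
   image, as a store without byte-store instructions has to do. */
static uint32_t insert_byte(uint32_t word, unsigned idx, uint8_t val)
{
    assert(idx < 4);
    unsigned shift = idx * 8;
    return (word & ~((uint32_t)0xff << shift)) | ((uint32_t)val << shift);
}

/* Two writers update *different* bytes of the same longword using an
   unprotected read-modify-write. Both load before either stores. */
static uint32_t lost_update_demo(void)
{
    uint32_t mem = 0;
    uint32_t seen_by_a = mem;              /* writer A loads the longword */
    uint32_t seen_by_b = mem;              /* writer B loads the longword */
    mem = insert_byte(seen_by_a, 0, 0x11); /* A stores byte 0 */
    mem = insert_byte(seen_by_b, 1, 0x22); /* B stores byte 1 from a stale
                                              image, wiping A's byte 0 */
    return mem;
}
```

With this interleaving the final longword is 0x00002200 rather than 0x00002211: writer A's byte is lost even though the two writers never touched the same byte.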

[PATCH v2 2/7] Alpha: Optimize block moves coming from longword-aligned source

2025-01-06 Thread Maciej W. Rozycki
Now that we have proper alignment determination for block moves in place the case of copying a block of longword-aligned data has become real, so implement the merging of loaded data from pairs of SImode registers into single DImode registers for the purpose of using with unaligned stores effic

[PATCH v2 1/7] Alpha: Always respect -mbwx, -mcix, -mfix, -mmax, and their inverse

2025-01-06 Thread Maciej W. Rozycki
Contrary to user documentation the `-mbwx', `-mcix', `-mfix', `-mmax' feature options and their inverse forms are ignored whenever `-mcpu=' option is in effect, either by having been given explicitly or where configured as the default such as with the `alphaev56-linux-gnu' target. In the latte

[PATCH v2 0/7] Fix data races with sub-longword accesses on Alpha

2025-01-06 Thread Maciej W. Rozycki
Hi, This is v2 of the series updated according to the outcome from testing with a BWX system. Only test cases have been updated and no changes have been made to actual code and obviously only the patches still outstanding have been included. Additionally 1/7 has been folded into this series

[PATCH] or1k: add .note.GNU-stack section on linux

2025-01-06 Thread Stafford Horne
In the OpenRISC build we get the following warning: ld: warning: __modsi3_s.o: missing .note.GNU-stack section implies executable stack ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker Fix this by adding a .note.GNU-stack to indicate the stack

Re: [PATCH v2] Replace uptr by usize/SIZE_T in interfaces

2025-01-06 Thread Jakub Jelinek
On Mon, Jan 06, 2025 at 11:59:08AM +0100, Stefan Schulze Frielinghaus wrote: > For some targets uptr is mapped to unsigned int and size_t to unsigned > long and sizeof(int)==sizeof(long) holds. Still, these are distinct > types and type checking may fail. Therefore, replace uptr by > usize/SIZE_T

[PATCH v3 5/6] c++/modules: Add testcase for fixed ICE [PR116568]

2025-01-06 Thread Nathaniel Shead
This ICE was fixed by ensuring that the lambdas had LAMBDA_EXPR_EXTRA_SCOPE properly set. PR c++/116568 gcc/testsuite/ChangeLog: * g++.dg/modules/lambda-8.h: New test. * g++.dg/modules/lambda-8_a.H: New test. * g++.dg/modules/lambda-8_b.C: New test. Signed-off-by

[PATCH v3 6/6] c++/modules: Diagnose TU-local lambdas, give mangling scope to lambdas in concepts

2025-01-06 Thread Nathaniel Shead
Happy to defer this till GCC16 if preferred. -- >8 -- This fills in a hole left in r15-6378-g9016c5ac94c557 with regards to detection of TU-local lambdas. Now that LAMBDA_EXPR_EXTRA_SCOPE is properly set for most lambdas we can use it to detect lambdas that are TU-local. Lambdas in concept defi

[PATCH v3 4/6] c++: Update mangling of lambdas in expressions

2025-01-06 Thread Nathaniel Shead
https://github.com/itanium-cxx-abi/cxx-abi/pull/85 clarifies that mangling a lambda expression should use 'L' rather than "tl". This only affects C++20 (and later) so no ABI flag is given. gcc/cp/ChangeLog: * mangle.cc (write_expression): Update mangling for lambdas. gcc/testsuite/Chang

[PATCH v3 3/6] c++: Fix ABI for lambdas declared in alias templates [PR116568]

2025-01-06 Thread Nathaniel Shead
I'm not 100% sure I've handled this properly, any feedback welcome. In particular, maybe should I use `DECL_IMPLICIT_TYPEDEF_P` in the mangling logic instead of `!TYPE_DECL_ALIAS_P`? They both seem to work in this case but not sure which would be clearer. I also looked into trying do a limited fo

[PATCH v3 2/6] c++: Fix mangling of otherwise unattached class-scope lambdas [PR118245]

2025-01-06 Thread Nathaniel Shead
Something like this should probably be backported to GCC 14 too, since my change in r14-9232-g3685fae23bb008 inadvertently caused ICEs that this fixes. But without the previous patch this patch will cause ABI changes, and I'm not sure how easy it would be to divorce those changes from the fix he

[PATCH v3 1/6] c++: Fix mangling of lambdas in static data member initializers [PR107741]

2025-01-06 Thread Nathaniel Shead
This fixes an issue where lambdas declared in the initializer of a static data member within the class body do not get a mangling scope of that variable; this results in mangled names that do not conform to the ABI spec. To do this, the patch splits up grokfield for this case specifically, allowin

[PATCH v3 0/6] c++: Add some missing LAMBDA_EXPR_EXTRA_SCOPEs

2025-01-06 Thread Nathaniel Shead
This patch series fixes some ABI issues in lambdas, with a side effect of fixing some issues with module streaming. This doesn't completely fix the ABI for lambdas (in particular, namespace scope aliases are still broken) but it at least improves the situation. Successfully bootstrapped and regte

[PATCH 3/4] vect: Ensure profile consistency when adding epilog guard [PR117790]

2025-01-06 Thread Alex Coplan
This patch tries to make the CFG profile consistent when adding a guard edge to skip the epilog during peeling. The changes can be summarized as follows: - We avoid adding the guard edge entirely if the guard condition folds to false, otherwise the profile will become inconsistent since the

[PATCH 4/4] vect: Fix scale_profile_for_vect_loop for multiple exits [PR117790]

2025-01-06 Thread Alex Coplan
This adjusts scale_profile_for_vect_loop to DTRT for loops with multiple exits, namely using scale_loop_profile_hold_exit_counts instead and scaling the expected niters by 1 / VF. Tested as a series on aarch64-linux-gnu, arm-linux-gnueabihf, and x86_64-linux-gnu. OK for trunk? Thanks, Alex gcc/

[PATCH 2/4] cfgloopmanip: Add infrastructure for scaling of multi-exit loops [PR117790]

2025-01-06 Thread Alex Coplan
As it stands, scale_loop_profile doesn't correctly handle loops with multiple exits. In particular, in the case where the expected niters exceeds iteration_bound, scale_loop_profile attempts to reduce the number of iterations with a call to scale_loop_frequencies, which multiplies the count of eac

[PATCH 1/4] vect: Set counts of early break exit blocks correctly [PR117790]

2025-01-06 Thread Alex Coplan
This adds missing code to correctly set the counts of the exit blocks we create when building the CFG for a vectorized early break loop. Tested as a series on aarch64-linux-gnu, arm-linux-gnueabihf, and x86_64-linux-gnu. OK for trunk? Thanks, Alex gcc/ChangeLog: PR tree-optimization/11

[PATCH 0/4] vect, cfgloopmanip: Fix profile consistency of early break loops [PR117790]

2025-01-06 Thread Alex Coplan
This patch series aims to fix the consistency of the CFG profile for vectorized early break loops. I.e., if the CFG profile entering the vectorizer is consistent for a given early break loop, this series aims to ensure that the final vectorized loop also has a consistent profile. For SPEC CPU 201

[PING][PATCH v2] Add new hardreg PRE pass

2025-01-06 Thread Andrew Carlotti
On Tue, Dec 17, 2024 at 11:53:24AM +0000, Andrew Carlotti wrote: > This pass is used to optimise assignments to the FPMR register in > aarch64. I chose to implement this as a middle-end pass because it > mostly reuses the existing RTL PRE code within gcse.cc. > > Compared to RTL PRE, the key diff

[Ada] Fix PR ada/118247

2025-01-06 Thread Eric Botcazou
This is a regression introduced by https://gcc.gnu.org/pipermail/gcc-cvs/2024-July/405522.html in the form of a spurious relinking of the gnatbind executable for the install target of cross Ada compilers. Tested on x86-64/Linux, applied on the mainline. 2025-01-06 Eric Botcazou PR

[PATCH v2] Replace uptr by usize/SIZE_T in interfaces

2025-01-06 Thread Stefan Schulze Frielinghaus
For some targets uptr is mapped to unsigned int and size_t to unsigned long and sizeof(int)==sizeof(long) holds. Still, these are distinct types and type checking may fail. Therefore, replace uptr by usize/SIZE_T wherever a size_t is expected. Part of #116957 Cherry picked from LLVM commit 9a15
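The type-checking problem described above can be illustrated with a minimal C++ sketch. The names `uptr_like`, `size_like`, `Min`, and `min_bytes` are hypothetical stand-ins chosen for this example, not the actual libsanitizer definitions; the sketch only assumes a target where the two typedefs resolve to `unsigned int` and `unsigned long`.

```cpp
#include <type_traits>

// Hypothetical stand-ins for a target where uptr maps to unsigned int
// and size_t maps to unsigned long (assumption for illustration only).
using uptr_like = unsigned int;
using size_like = unsigned long;

// The two are distinct types in C++ even on targets where
// sizeof(int) == sizeof(long).
static_assert(!std::is_same<uptr_like, size_like>::value,
              "unsigned int and unsigned long are always distinct types");

// A Min-style template that deduces one type from both arguments.
template <typename T>
T Min(T a, T b) {
  return a < b ? a : b;
}

// Mixing the two types in Min(a, b) would fail template argument
// deduction, so the common type has to be spelled out explicitly.
size_like min_bytes(uptr_like a, size_like b) {
  // Min(a, b);  // error: conflicting deduced types for T
  return Min<size_like>(a, b);
}
```

Using one size type consistently in the interface, as the patch does, avoids the need for such explicit conversions at each call site.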

Re: [PATCH] [sanitizer] Replace uptr by usize/SIZE_T in interfaces

2025-01-06 Thread Stefan Schulze Frielinghaus
On Mon, Jan 06, 2025 at 10:49:31AM +0100, Jakub Jelinek wrote: > On Mon, Jan 06, 2025 at 09:55:16AM +0100, Stefan Schulze Frielinghaus wrote: > > For some targets uptr is mapped to unsigned int and size_t to unsigned > > long and sizeof(int)==sizeof(long) holds. Still, these are distinct > > types

Re: [Ping, Fortran, Patch, PR114612, v1] Fix missing deep-copy for allocatable components of derived types having cycles.

2025-01-06 Thread Andre Vehreschild
Hi all, attached patch has been rebased to latest trunk. Just pinging! Regtests ok on x86_64-pc-linux-gnu / F41. Ok for mainline? - Andre On Fri, 13 Dec 2024 12:10:58 +0100 Andre Vehreschild wrote: > Hi all, > > attached patch fixes deep-copying (or rather its former absence) for > allocatabl

Re: [Ping, Fortran, Patch, PR116669, v1] Fix ICE in deallocation of derived types having cyclic dependencies

2025-01-06 Thread Andre Vehreschild
Hi all, pinging attached rebased patch. Regtests ok on x86_64-pc-linux-gnu / F41. Ok for mainline? - Andre On Thu, 12 Dec 2024 14:50:13 +0100 Andre Vehreschild wrote: > Hi all, > > attached patch improves analysis of cycles in derived types, i.e. type > dependencies ala: > > type(T) > type(

[PATCH v4] RISC-V: Fix code gen for reduction with length 0 [PR118182]

2025-01-06 Thread Kito Cheng
`.MASK_LEN_FOLD_LEFT_PLUS` (or `mask_len_fold_left_plus_m`) expects the return value to be the start value even if the length is 0. However, the current code gen in the RISC-V backend does not meet that semantics; it results in a random garbage value if the length is 0. Take as an example the current code gen for
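The expected behaviour for a zero length can be sketched with a scalar model in C++. This is only an illustration under stated assumptions: `fold_left_plus` is a hypothetical helper, not GCC's internal function, and it ignores details such as the bias operand of the real pattern.

```cpp
#include <cstddef>

// Hypothetical scalar model of a masked, length-limited, in-order
// (fold-left) sum: start + v[0] + v[1] + ... over the first `len`
// elements whose mask bit is set.
float fold_left_plus(float start, const float *v, const bool *mask,
                     std::size_t len) {
  float acc = start;
  for (std::size_t i = 0; i < len; ++i) {
    if (mask[i]) {
      acc += v[i];
    }
  }
  // With len == 0 the loop body never runs, so the start value is
  // returned unchanged. This is the property the snippet describes.
  return acc;
}
```

In this model a zero length never reads `v` or `mask`, so the result is exactly the start value rather than an undefined one.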

Re: [PATCH v3] RISC-V: Fix code gen for reduction with length 0 [PR118182]

2025-01-06 Thread Kito Cheng
I had a few more thoughts during the vacation: we don't really need those fixes (workarounds for VL=0) *IF* we know the VL is constant or VLMAX, and that is the situation for out-loop reduction. The only thing we need to fix is the in-loop reduction, which should be the only case affected. On Fr

Re: [PATCH] [sanitizer] Fix few size types in memprof (#119114)

2025-01-06 Thread Jakub Jelinek
On Mon, Jan 06, 2025 at 09:55:18AM +0100, Stefan Schulze Frielinghaus wrote: > From: Vitaly Buka > > Fix type in a few related Min() calls. > > Follow up to #116957. > > Co-authored-by: Stefan Schulze Frielinghaus > > Cherry picked from LLVM commit 6dec33834d1fd89f16e271dde9607c1de9554144 > (

Re: [PATCH] [sanitizer] Replace uptr by usize/SIZE_T in interfaces

2025-01-06 Thread Jakub Jelinek
On Mon, Jan 06, 2025 at 09:55:16AM +0100, Stefan Schulze Frielinghaus wrote: > For some targets uptr is mapped to unsigned int and size_t to unsigned > long and sizeof(int)==sizeof(long) holds. Still, these are distinct > types and type checking may fail. Therefore, replace uptr by > usize/SIZE_T

[COMMITTED 30/30] ada: Fix small thinko in previous change to two-pass aggregate expansion

2025-01-06 Thread Marc Poulhiès
From: Eric Botcazou We need a type tailored to the base index type to compute the length. gcc/ada/ChangeLog: * exp_aggr.adb (Two_Pass_Aggregate_Expansion): Use the base type of the index type to find the type used to compute the length. Tested on x86_64-pc-linux-gnu, committed

[COMMITTED 28/30] ada: Fix predicate involving array indexing rejected in generic package

2025-01-06 Thread Marc Poulhiès
From: Eric Botcazou The indexing is rejected with the message: error: reference to current instance of type does not denote a type when it is applied to a prefix which is the current instance of the type to which the predicate is applied. There is already a specific handling of component sel

[COMMITTED 24/30] ada: Correct xref of operator expression function body

2025-01-06 Thread Marc Poulhiès
From: Bob Duff For an expression function body that is an operator, make sure the xref entry in the ALI file points one past the double quote mark. For example, if the name is ">", point to the greater-than symbol, not the double quote. This was already the case for proper bodies. gcc/ada/Change

[COMMITTED 16/30] ada: Plug small loophole in previous change

2025-01-06 Thread Marc Poulhiès
From: Eric Botcazou The initial change only deals with the controlled record case for assignment statements, but the controlled array case needs the same treatment. gcc/ada/ChangeLog: * exp_ch5.adb (Expand_Assign_Array): Bail out for controlled components if the RHS is a functio

[COMMITTED 15/30] ada: Fix printing boolean attributes in the SARIF report

2025-01-06 Thread Marc Poulhiès
From: Viljar Indus Boolean attributes should have the value true or false without any quotes. gcc/ada/ChangeLog: * diagnostics-json_utils.adb: Add new method Write_Boolean_Attribute. * diagnostics-json_utils.ads: Likewise. * diagnostics-sarif_emitter.adb (Print_I

Re: [PATCH] [sanitizer] Add type __sanitizer::ssize (#116957)

2025-01-06 Thread Jakub Jelinek
On Mon, Jan 06, 2025 at 09:55:17AM +0100, Stefan Schulze Frielinghaus wrote: > Cherry picked from LLVM commit ce44640fe29550461120d22b0358e6cac4aed822. > > PR sanitizer/117725 This line needs to be tab indented, best put right before the * interception/ line. > libsanitizer/ChangeLog: > >

Re: [PATCH] [sanitizer] Fix type in some Min() calls (#119248)

2025-01-06 Thread Jakub Jelinek
On Mon, Jan 06, 2025 at 09:55:19AM +0100, Stefan Schulze Frielinghaus wrote: > This is a follow-up to 6dec33834d1fd89f16e271dde9607c1de9554144 and > #116957 and #119114. > > Cherry picked from LLVM commit 65a2eb0b1589590ae78cc1e5f05cd004b3b3bec5. > > PR sanitizer/117725 > libsanitizer/ChangeLog:

[COMMITTED 29/30] ada: Streamline runtime support of finalization collections

2025-01-06 Thread Marc Poulhiès
From: Eric Botcazou Finalization collections are declared as (limited) controlled types so that they can be naturally attached to a finalization master, but the same result can be achieved by means of (limited) finalizable types, which need not be tagged and thus avoid dragging the runtime suppor

[COMMITTED 11/30] ada: null procedure cannot be used as compilation unit

2025-01-06 Thread Marc Poulhiès
From: Bob Duff This patch gives a syntax error if a null procedure is used as a compilation unit. The error was already given during semantic analysis; now it is given in the parser, which is more convenient for other tools like gprbuild, because the -gnats switch now gives the error. Note that
