[PATCH] debug/100530 - Revert QUAL_ADDR_SPACE handling from dwarf2out.cc

2025-01-31 Thread Richard Biener
The bug clearly shows that r8-4385-ga297ccb52e0c89 was wrong in enabling handling of address-space qualification as DWARF type qualifiers as the code isn't prepared for it actually not being handled and ends up changing a lesser qualified (without address-space) type DIE in ways that trip asserts. The

[PATCH] Do not rely on non-SLP analysis for SLP outer loop vectorization

2025-01-31 Thread Richard Biener
We end up relying on non-SLP analysis of the inner loop LC PHI to set the vectorization method for SLP since vectorizable_reduction claims responsibility. The following fixes this. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. * tree-vect-loop.cc (vect_analyze_loop_operat

[PATCH] force-indirect-call-2.c: Allow indirect branch via GOT

2025-01-31 Thread H.J. Lu
r15-1619-g3b9b8d6cfdf593 changed the codegen from f2: .cfi_startproc pushq %rbx .cfi_def_cfa_offset 16 .cfi_offset 3, -16 movq f1@GOTPCREL(%rip), %rbx call *%rbx leaq f3(%rip), %rax call *%rax movq %rbx, %rax

Re: [PATCH] force-indirect-call-2.c: Allow indirect branch via GOT

2025-01-31 Thread Uros Bizjak
On Fri, Jan 31, 2025 at 11:35 AM H.J. Lu wrote: > r15-1619-g3b9b8d6cfdf593 changed the codegen from > > f2: > .cfi_startproc > pushq %rbx > .cfi_def_cfa_offset 16 > .cfi_offset 3, -16 > movq f1@GOTPCREL(%rip), %rbx > call *%rbx > lea

[PATCH] OpenMP/Fortran: Add missing pop_state in parse_omp_dispatch

2025-01-31 Thread Paul-Antoine Arras
When the ST_NONE case is taken, the function returns immediately. Not calling pop_state causes a dangling pointer. gcc/fortran/ChangeLog: * parse.cc (parse_omp_dispatch): Add missing pop_state. --- gcc/fortran/parse.cc | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git

[PATCH] libstdc++: Use canonical loop form in std::reduce

2025-01-31 Thread Abhishek Kaushik
From 4ac7c7e56e23ed2f4dd2dafdfab6cfa110c14260 Mon Sep 17 00:00:00 2001 From: Abhishek Kaushik Date: Fri, 31 Jan 2025 01:28:48 -0800 Subject: [PATCH] libstdc++: Use canonical loop form in std::reduce The current while loop in std::reduce and related functions is hard to vectorize because the loop
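As a rough illustration of what "canonical loop form" could mean here, the sketch below contrasts a while-loop reduction with an equivalent for-loop whose increment sits in the loop header. The function names `reduce_while`, `reduce_for` and `check_reduce_forms` are hypothetical and are not the libstdc++ internals; the actual patch may differ in detail.

```cpp
#include <cassert>

// While-loop form: the iterator increment lives inside the loop body.
template <typename It, typename T, typename Op>
T reduce_while(It first, It last, T init, Op op)
{
    while (first != last) {
        init = op(init, *first);
        ++first;
    }
    return init;
}

// For-loop form: condition and increment are both in the loop header,
// which is the shape some vectorizers recognise more readily.
template <typename It, typename T, typename Op>
T reduce_for(It first, It last, T init, Op op)
{
    for (; first != last; ++first)
        init = op(init, *first);
    return init;
}

// Hypothetical self-check: both forms must agree on the same input.
inline void check_reduce_forms()
{
    const int v[] = {1, 2, 3, 4, 5};
    auto add = [](int a, int b) { return a + b; };
    assert(reduce_while(v, v + 5, 0, add) == 15);
    assert(reduce_for(v, v + 5, 0, add) == 15);
}
```

Both functions compute the same result; whether a given compiler vectorizes one and not the other depends on the compiler and flags, which is why the thread points at a Compiler Explorer comparison.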

Re: [PATCH] debug/100530 - Revert QUAL_ADDR_SPACE handling from dwarf2out.cc

2025-01-31 Thread Jakub Jelinek
On Fri, Jan 31, 2025 at 09:07:52AM +0100, Richard Biener wrote: > The bug clearly shows that r8-4385-ga297ccb52e0c89 was wrong in > enabling handling of address-space qualification as DWARF type > qualifiers as the code isn't prepared for it actually not being handled > and ends up changing a lesser qu

[PATCH] x86: Handle -mindirect-branch-register for indirect calls

2025-01-31 Thread H.J. Lu
-mindirect-branch-register requires indirect call and jump via register. For -mindirect-branch-register, expand indirect call via register and update call patterns and peepholes to disable indirect call via memory. gcc/ PR target/115673 * config/i386/i386-expand.cc (ix86_expand

Re: [PATCH 2/2] Add prime path coverage to gcc/gcov

2025-01-31 Thread Jørgen Kvalsvik
Ping. Should I apply these changes and re-submit, or would you like to see more changes? Thanks, Jørgen On 1/5/25 22:06, Jørgen Kvalsvik wrote: On 1/5/25 20:53, Jørgen Kvalsvik wrote: On 1/5/25 20:25, Jan Hubicka wrote: ALGORITHM Since the number of paths grows so fast, we need a good algo

[PATCH] Record, report basic blocks of conditional exprs

2025-01-31 Thread Jørgen Kvalsvik
Record basic blocks that make up a conditional expression with -fcondition-coverage and report them when using the gcov -w/--verbose flag. This makes the report more accurate when basic blocks are included as there may be blocks in between the actual Boolean expressions, e.g. when a term is the re

[PATCH] icf: Compare call argument types in certain cases and asm operands [PR117432]

2025-01-31 Thread Jakub Jelinek
Hi! compare_operand uses operand_equal_p under the hood, which e.g. for INTEGER_CSTs will just match the values regardless of their types. Now, in many cases comparing the type is redundant, if we have x_2 = y_3 + 1; we've already compared the type for the lhs and also for rhs1, there won't be

[PATCH] niter: Make build_cltz_expr more robust [PR118689]

2025-01-31 Thread Jakub Jelinek
Hi! Since my r15-7223 the niter analysis can recognize one loop during bootstrap as being ctz like. The patch just turned @@ -2173,7 +2173,7 @@ PROC m2pim_NumberIO_BinToStr (CARDINAL x _T535_44 = &buf[i.40_2]{lb: 1 sz: 4}; _T536_45 = x_21 & 1; *_T535_44 = _T536_45; - _T537_47 = x_21 / 2;

Re: [PATCH] niter: Make build_cltz_expr more robust [PR118689]

2025-01-31 Thread Richard Biener
> Am 31.01.2025 um 10:24 schrieb Jakub Jelinek : > > Hi! > > Since my r15-7223 the niter analysis can recognize one loop during bootstrap > as being ctz like. > The patch just turned > @@ -2173,7 +2173,7 @@ PROC m2pim_NumberIO_BinToStr (CARDINAL x > _T535_44 = &buf[i.40_2]{lb: 1 sz: 4}; >

Re: [PATCH] icf: Compare call argument types in certain cases and asm operands [PR117432]

2025-01-31 Thread Jakub Jelinek
On Fri, Jan 31, 2025 at 02:19:28PM +0100, Richard Biener wrote: > > For internal calls gimple_call_fndecl (s1) will be NULL, so > > !gimple_call_fndecl (s1) will be true and so the new checks aren't done. > > Yes, but also fntype1/2 will be NULL then. > > > > if (gimple_call_internal_p (s1) (with

[PATCH v2] x86: Handle -mindirect-branch-register for -fno-plt

2025-01-31 Thread H.J. Lu
-fno-plt forces external call to indirect call via GOT memory. But -mindirect-branch-register requires indirect call and jump via register. For -mindirect-branch-register, expand indirect call via register and update call patterns and peepholes to disable indirect call via memory. gcc/

Re: [PATCH] x86: Handle -mindirect-branch-register for indirect calls

2025-01-31 Thread H.J. Lu
On Fri, Jan 31, 2025 at 8:44 PM Uros Bizjak wrote: > > On Fri, Jan 31, 2025 at 12:09 PM H.J. Lu wrote: > > > > -mindirect-branch-register requires indirect call and jump via register. > > For -mindirect-branch-register, expand indirect call via register and > > update call patterns and peephol

[PATCH 1/3] c++: Fix mangling of lambdas in static member template initializers [PR107741]

2025-01-31 Thread Nathaniel Shead
Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk? -- >8 -- My fix for this issue in r15-7147 turns out to not be quite sufficient; static member templates apparently go down a different code path and need their own handling. PR c++/107741 gcc/cp/ChangeLog: * decl

[PATCH 2/3] c++: Clear lambda scope for unattached member template lambdas

2025-01-31 Thread Nathaniel Shead
Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk? -- >8 -- In r15-7202 we made lambdas between a template parameter scope and a class/function/initializer be considered TU-local, in lieu of working out how to mangle them to the succeeding declaration. I neglected to clear any exis

Re: [PATCH v2] x86: Handle -mindirect-branch-register for -fno-plt

2025-01-31 Thread Uros Bizjak
On Fri, Jan 31, 2025 at 2:54 PM Uros Bizjak wrote: > > On Fri, Jan 31, 2025 at 2:36 PM H.J. Lu wrote: > > > > -fno-plt forces external call to indirect call via GOT memory. But > > -mindirect-branch-register requires indirect call and jump via register. > > For -mindirect-branch-register, expand

[PATCH] aarch64: Fix dupq_* testsuite failures

2025-01-31 Thread Richard Sandiford
This patch fixes the dupq_* testsuite failures. The tests were introduced with r15-3669-ga92f54f580c3 (which was a nice improvement) and Pengxuan originally had a follow-on patch to recognise INDEX constants during vec_init. I'd originally wanted to solve this a different way, using wildcards whe

[PATCH v11] c++: Fix overeager Woverloaded-virtual with conversion operators [PR109918]

2025-01-31 Thread Simon Martin
Hi Jason, On 27 Jan 2025, at 16:49, Jason Merrill wrote: > On 1/27/25 10:41 AM, Simon Martin wrote: >> Hi Jason, >> >> On 17 Jan 2025, at 23:33, Jason Merrill wrote: >> >>> On 1/17/25 9:52 AM, Simon Martin wrote: Hi Jason, On 16 Jan 2025, at 22:49, Jason Merrill wrote: > O

[PATCH 0/61] Improve Mips target

2025-01-31 Thread Aleksandar Rakic
This patch series improves the support for the mips64r6 target in GCC, includes enhancements and general bug fixes, and contains other MIPS ISA and processor enablement. These patches are cherry-picked from the mips_rel/11_2_0/master and mips_rel/9_3_0/master branches from the MIPS' reposito

Re: [PATCH] icf: Compare call argument types in certain cases and asm operands [PR117432]

2025-01-31 Thread Richard Biener
On Fri, 31 Jan 2025, Jakub Jelinek wrote: > On Fri, Jan 31, 2025 at 01:38:36PM +0100, Richard Biener wrote: > > > @@ -718,8 +720,11 @@ func_checker::compare_gimple_call (gcall > > > > > >/* For direct calls we verify that types are compatible so if we > > > matched > > > callees, call

[PATCH] icf, v2: Compare call argument types in certain cases and asm operands [PR117432]

2025-01-31 Thread Jakub Jelinek
On Fri, Jan 31, 2025 at 02:29:57PM +0100, Jakub Jelinek wrote: > > } > > else > > { > > tree fntype1 = gimple_call_fntype (s1); > > tree fntype2 = gimple_call_fntype (s2); > > > > if ((fntype1 && !fntype2) > > || (!fntype1 && fntype2) > > || (fntype1

Re: [PATCH] libstdc++: Use canonical loop form in std::reduce

2025-01-31 Thread Abhishek Kaushik
Sorry for the confusion, the change is for the Intel compiler, which is not able to vectorize the while loop correctly. I'll change the commit message to show this clearly. But it looks like the change still might be beneficial to g++: https://godbolt.org/z/Mo3PdxbaY

[PATCH] libstdc++: Use canonical loop form in std::reduce

2025-01-31 Thread Abhishek Kaushik
From 7a7c9a2a976fbb29f67c46284e7c1581cbe8cb07 Mon Sep 17 00:00:00 2001 From: Abhishek Kaushik Date: Fri, 31 Jan 2025 01:28:48 -0800 Subject: [PATCH] libstdc++: Use canonical loop form in std::reduce This change is for the INTEL C compiler (icx). The current while loop in std::reduce and related

Re: [PATCH] c++: auto in trailing-return-type in parameter [PR117778]

2025-01-31 Thread Jason Merrill
On 1/30/25 5:24 PM, Marek Polacek wrote: Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/14? -- >8 -- This PR describes a few issues, both ICE and rejects-valid, but ultimately the problem is that we don't properly synthesize the second auto in: int g (auto fp() -> auto) {

Re: [PATCH v2] c++: Don't merge friend declarations that specify default arguments [PR118319]

2025-01-31 Thread Simon Martin
Hi Jason, On 31 Jan 2025, at 16:29, Jason Merrill wrote: > On 1/31/25 9:52 AM, Simon Martin wrote: >> Hi Jason, >> >> On 9 Jan 2025, at 22:55, Jason Merrill wrote: >> >>> On 1/9/25 8:25 AM, Simon Martin wrote: We segfault upon the following invalid code === cut here === templa

[PATCH 17/61] Add -munique-sections feature

2025-01-31 Thread Aleksandar Rakic
From: Matthew Fortune gcc/ * config/mips/mips.cc (mips_unique_sections_list): New global variable. (mips_read_list): Update prototype and error message. (ultimate_transparent_alias_target): New function. Copied from varasm.c. (mips_asm_unique_secti

[PATCH 15/61] Possible inlining improvements with -Os

2025-01-31 Thread Aleksandar Rakic
From: Robert Suchanek --param early-inlining-insns-cold=NUMBER --param max-inline-insns-small-and-cold=NUMBER Analysis shows that the main difference between -O2 and -Os comes down to inlining of cold or unlikely functions. The new parameters (defaulting to 0) are meant to disable these limitations wit

[PATCH 29/61] Prevent FP values being spilled to GPRs

2025-01-31 Thread Aleksandar Rakic
From: Simon Dardis gcc/ * config/mips/mips.cc (mips_ira_change_pseudo_allocno_class): Prevent FP modes being reloaded to GPRs. Don't force integer mode pseudos into GR_REGS (and likewise for float mode pseudos and FP_REGS) if both the allocno class and best cost cl

[PATCH 05/61] Hazard barrier return support

2025-01-31 Thread Aleksandar Rakic
From: Chao-ying Fu gcc/ * config/mips/mips.cc (mips_use_hazard_barrier_return_p): New static function. (mips_function_attr_inlinable_p): Likewise. (mips_compute_frame_info): Set use_hazard_barrier_return_p. Emit error for unsupported architecture choice.

[PATCH 31/61] Improve aligned straight line memcpy

2025-01-31 Thread Aleksandar Rakic
From: Robert Suchanek Cherry-picked 4194c529fade9b3106d118cac63b71bc8b13f7be from https://github.com/MIPS/gcc Signed-off-by: Robert Suchanek Signed-off-by: Faraz Shahbazker Signed-off-by: Aleksandar Rakic --- gcc/config/mips/mips.cc | 8 +++- gcc/config/mips/mips.h | 5 + 2 files ch

[PATCH 22/61] Add -minline-intermix to ignore mips16/nomips16

2025-01-31 Thread Aleksandar Rakic
From: Matthew Fortune Add a CLI option and an inline_intermix function attribute to ignore ISA differences between a caller and a callee. The format of this attribute is __attribute__((inline_intermix(yes|no))). gcc/ * doc/extend.texi: Document inline_intermix. * config/mips/mips

[PATCH 13/61] MIPS: Only split shifts if using -mdebugd

2025-01-31 Thread Aleksandar Rakic
From: Andrew Bennett Enable -mdebugd by default. Cherry-picked adb95984114b7636ee15f2ba79f94b028c8b35b2 from https://github.com/MIPS/gcc Signed-off-by: Andrew Bennett Signed-off-by: Faraz Shahbazker Signed-off-by: Aleksandar Rakic --- gcc/config/mips/mips.md | 1 + gcc/config/mips/mips.opt

[PATCH 33/61] Testsuite: Fix insn-*.c tests from trunk

2025-01-31 Thread Aleksandar Rakic
From: Matthew Fortune Ensure micromips test does not get confused about library support. Ensure insn-casesi.c and insn-tablejump.c can be executed. Move the micromips/mips16 selection into the file as per function attributes so that there is no requirement on having a full micromips or mips16 ru

[PATCH 56/61] Inefficient 64-bit signed modulo by powers of two

2025-01-31 Thread Aleksandar Rakic
From: Mihailo Stojanovic This adds a custom MIPS-specific modulo-by-power-of-two expander, which uses a modified algorithm tailored to the MIPS instruction set. gcc/ * config/mips/mips-protos.h (mips_expand_mod_pow2): New prototype. * config/mips/mips.cc (mips_rtx_costs): Don't force pow
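For readers unfamiliar with the problem, the sketch below shows one well-known branch-free way to compute a signed 64-bit remainder by a power of two. It is a generic textbook sequence and an assumption on my part, not the "modified algorithm" the patch implements; `mod_pow2` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Branch-free x % (1 << k) for signed 64-bit x, matching C++ truncating
// semantics (the result takes the sign of the dividend).
inline std::int64_t mod_pow2(std::int64_t x, unsigned k)
{
    assert(k <= 62);  // keep (1 << k) representable as a positive int64_t
    const std::uint64_t mask = (std::uint64_t{1} << k) - 1;
    const std::uint64_t ux = static_cast<std::uint64_t>(x);
    // bias equals mask when x is negative and 0 otherwise.
    const std::uint64_t bias = (std::uint64_t{0} - (ux >> 63)) & mask;
    // Adding the bias before masking makes negative inputs round toward zero.
    const std::uint64_t low = (ux + bias) & mask;
    // Both operands lie in [0, mask], so the signed subtraction cannot overflow.
    return static_cast<std::int64_t>(low) - static_cast<std::int64_t>(bias);
}
```

A plain `x & mask` is only correct for non-negative `x`; the bias term is what makes the signed case agree with the `%` operator.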

[PATCH 26/61] Load/store bonding improvements

2025-01-31 Thread Aleksandar Rakic
From: Robert Suchanek gcc/ChangeLog: * config/mips/mips-protos.h (mips_load_store_bonding_p): New prototype. * config/mips/mips.cc (mips_load_store_bond_insns): New static function. (mips_block_move_straight): Bond insns where possible. (mips_for_e

[PATCH 58/61] Add EHB after last load if branch within 16 inst.

2025-01-31 Thread Aleksandar Rakic
From: "dragan.mladjenovic" This workaround adds -mfix-i6400 and -mfix-i6500. If either of those two options is active, it will add an EHB after the last load instruction in a sequence if there is a branch within 16 instructions following it. The options have no effect on pre-R6 or compressed ISA targets

[PATCH 38/61] MIPSR6: Mark R6 unaligned access

2025-01-31 Thread Aleksandar Rakic
From: Matthew Fortune gcc/ * config/mips/mips.cc (mips_output_move): Mark unaligned load and store with a comment. Cherry-picked 42be7aa50f3b04a88768e08c000cfe7923f22b0f from https://github.com/MIPS/gcc Signed-off-by: Matthew Fortune Signed-off-by: Faraz Shahbazker Signed-off-

[PATCH 59/61] Add uclibc support

2025-01-31 Thread Aleksandar Rakic
From: Jean Lee Disable stack unwind and fix page size for uclibc on mips target. Fix "ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD." Disable SANITIZER_INTERCEPT_GLOB. Resolve libsanitizer bui

[PATCH 27/61] MIPSR6: Define new R6 FPU instructions

2025-01-31 Thread Aleksandar Rakic
From: Matthew Fortune gcc/ * config/mips/mips.h (ISA_HAS_FCLASS): New macro. (ISA_HAS_RINT): Likewise. * config/mips/mips.md (unspec): Add UNSPEC_FCLASS and UNSPEC_FRINT. (type): Add fclass and frint. (fnma4): Enable for ISA_HAS_FUSED_MADDF.

[PATCH 28/61] Fix wrong instruction in the delay slot

2025-01-31 Thread Aleksandar Rakic
From: Robert Suchanek The problematic test case shows that the use of __builtin_unreachable () has a branch not optimised away causing confusion in the eager delay slot filler if the "unreachable" is moved elsewhere by the block reordering pass. It appears that a series of unfortunate events cau

[PATCH 16/61] Add -msdata-num and -msdata-opt-list support

2025-01-31 Thread Aleksandar Rakic
From: Matthew Fortune Cherry-picked 2403e09c3a08b797e22e30f70f762ed1eadbd783 and f76b493c090cfc2f9270528e84ef0f04fb463c3f from https://github.com/MIPS/gcc Signed-off-by: Matthew Fortune Signed-off-by: Dragan Mladjenovic Signed-off-by: Faraz Shahbazker Signed-off-by: Aleksandar Rakic --- gcc

[PATCH 19/61] Add support for a limit for inlining memcpy

2025-01-31 Thread Aleksandar Rakic
From: Matthew Fortune Expose it with an option: -mblockmov-limit. A memcpy strictly less than this value will be considered for inlining. gcc/ChangeLog: * config/mips/mips.cc (mips_expand_block_move): Add support to control size of inlined memcpy. * config/mips/mips.opt

[PATCH 21/61] Testsuite: Modify the gcc.dg/memcpy-4.c test

2025-01-31 Thread Aleksandar Rakic
From: Andrew Bennett Firstly, remove the MIPS specific bit of the test. Secondly, create a MIPS specific version in the gcc.target/mips. This will only execute for a MIPS ISA less than R6. Cherry-picked c8b051cdbb1d5b166293513b0360d3d67cf31eb9 from https://github.com/MIPS/gcc Signed-off-by: And

[PATCH 18/61] Add -mfunc-opt-list=

2025-01-31 Thread Aleksandar Rakic
From: Simon Dardis New option for MIPS -mfunc-opt-list=FILE. This option takes a file which has one function per line followed by a whitespace (space/tab) followed by one or more attributes. Supported attributes are O2, Os, code-read=pcrel, always_inline, noinline, mips16, nomips16, epi, longcall

[PATCH 32/61] Account for LWL/LWR in store_by_pieces_p

2025-01-31 Thread Aleksandar Rakic
From: Matthew Fortune Cherry-picked 53d838794ad3379fdd8d1f3a812aa8f2dff56399 from https://github.com/MIPS/gcc Signed-off-by: Matthew Fortune Signed-off-by: Faraz Shahbazker Signed-off-by: Aleksandar Rakic --- gcc/config/mips/mips.cc | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-)

[PATCH 23/61] Add offset shrinking pass (-mshrink-offsets)

2025-01-31 Thread Aleksandar Rakic
From: mfortune This is derived from code produced by Steve Ellcey. This approach is slightly diverged from the original concept. It tries to adjust the base pointer to a common value and keep the costing lower than original by trying to find the best common value to trigger more 16-bit instruct

[PATCH 12/61] Add microMIPS R6 support

2025-01-31 Thread Aleksandar Rakic
From: Andrew Bennett Squashed commits: - Add umipsr6 compact branch support. - Multilib - microMIPS R6. - Don't think short micromips instructions are barriers. Some micromips insns have a length of 2, but unfortunately 2/4 returns 0, so the routine incorrectly thinks that the instruction is a b

[PATCH 10/61] Add -mgrow-frame-downwards

2025-01-31 Thread Aleksandar Rakic
From: mfortune Grow the local frame down instead of up for mips16 code size. By growing the frame downwards we get spill slots created at the lowest address rather than highest address in a local frame. The benefit being that when the frame is large the spill slots can still be accessed using a

[PATCH 46/61] nanoMIPS: unnecessary AND following an EXT

2025-01-31 Thread Aleksandar Rakic
From: "dragan.mladjenovic" The fwprop1 introduces a new use of Y by replacing the (subreg:QI (reg:SI X)) with (reg:QI Y) preventing the optimization of zero_extend later during the combine. This patch prevents this replacement in two new cases. A: (set (subreg:SI (reg:QI Y)) (z

[PATCH 43/61] Disable ssa-dom-cse-2.c for MIPS lp64

2025-01-31 Thread Aleksandar Rakic
From: Matthew Fortune The optimisation to reduce the result to constant 28 still happens but only much later in combine. gcc/testsuite/ * gcc.dg/tree-ssa/ssa-dom-cse-2.c: Do not check output for MIPS lp64 abi. Cherry-picked 7a9286a94817badb312e3bb2b4a7a83b8b3fa28a from https://g

[PATCH 37/61] Testsuite: Skip tests making calls to variables

2025-01-31 Thread Aleksandar Rakic
From: Matthew Fortune The compressed MIPS ISAs (microMIPS and MIPS16) require the LSB of an address to indicate which ISA to execute. The non-conformant patterns used in these tests cannot set the ISA mode bit and may attempt to directly call the variable which triggers an error from the assembl

[PATCH 36/61] Testsuite: Disable the time-profiler-2.c test

2025-01-31 Thread Aleksandar Rakic
From: Matthew Fortune gcc/testsuite/ * gcc.dg/tree-prof/time-profiler-2.c: Skip for mips* triples as it is unstable in simulation. Cherry-picked 7c5a494a31c72ee3285ffae9fda738aa875869b9 from https://github.com/MIPS/gcc Signed-off-by: Matthew Fortune Signed-off-by: Faraz Shahbaz

[PATCH 57/61] Implement synthesised conditional xor/or

2025-01-31 Thread Aleksandar Rakic
From: Mihailo Stojanovic Create an additional case for if-conversion which expands the following sequence: "if (test) x ^= C;" as a = 0; if (test) a = C; x ^= a; This reduces the number of necessary conditional moves on some targets (most notably MIPS). gcc/ * config/mips/mips.cc (mip
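The source-level equivalence behind this if-conversion can be sketched as below. The two functions mirror the "before" and "after" shapes quoted in the message; the names `cond_xor_branchy`, `cond_xor_synth` and `check_cond_xor` are hypothetical, and the real transformation happens on RTL inside the compiler rather than in user code.

```cpp
#include <cassert>
#include <cstdint>

// Original shape: a conditional read-modify-write of x.
inline std::uint32_t cond_xor_branchy(bool test, std::uint32_t x, std::uint32_t c)
{
    if (test)
        x ^= c;
    return x;
}

// Synthesised shape: conditionally select the mask, then xor unconditionally.
// Only the assignment to a needs a conditional move; x ^= 0 leaves x unchanged.
inline std::uint32_t cond_xor_synth(bool test, std::uint32_t x, std::uint32_t c)
{
    std::uint32_t a = 0;
    if (test)
        a = c;
    x ^= a;
    return x;
}

// Hypothetical self-check: both shapes agree for a handful of inputs.
inline void check_cond_xor()
{
    const std::uint32_t vals[] = {0u, 1u, 0xdeadbeefu, 0xffffffffu};
    for (std::uint32_t x : vals)
        for (std::uint32_t c : vals)
            for (int t = 0; t < 2; ++t)
                assert(cond_xor_branchy(t != 0, x, c) == cond_xor_synth(t != 0, x, c));
}
```

The same reasoning applies to `|=`, since 0 is also the identity element for bitwise or.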

[PATCH 49/61] Make rtl if-conversion more common

2025-01-31 Thread Aleksandar Rakic
From: "dragan.mladjenovic" Tune ifcvt parameters, so that we get if-conversion in more cases. gcc/ * config/mips/mips.cc (mips_rtx_costs): Reduce cost of if_then_else pattern. (mips_max_noce_ifcvt_seq_cost): New function. Decrease maximum permissible cost for the

[PATCH 1/2] libstdc++: Fix return value of vector::insert_range

2025-01-31 Thread Patrick Palka
In some cases we're wrongly returning an iterator pointing to (one past) the last element inserted instead of to the first element inserted. libstdc++-v3/ChangeLog: * include/bits/stl_bvector.h (vector::insert_range): Consistently return an iterator pointing to the first element

[PATCH 61/61] Fix pr54240

2025-01-31 Thread Aleksandar Rakic
From: Chao-ying Fu gcc/testsuite/ * gcc.target/mips/pr54240.c: Scan phiopt2. Cherry-picked 02dd052d4822ca187af075f1fb5301c954844144 from https://github.com/MIPS/gcc Signed-off-by: Chao-ying Fu Signed-off-by: Aleksandar Rakic --- gcc/testsuite/gcc.target/mips/pr54240.c | 2 +- 1 file

[PATCH/GCC16 v2 1/1] AArch64: Emit half-precision FCMP/FCMPE

2025-01-31 Thread Spencer Abson
Enable a target with FEAT_FP16 to emit the half-precision variants of FCMP/FCMPE. gcc/ChangeLog: * config/aarch64/aarch64.md: Update cbranch, cstore, fcmp and fcmpe to use the GPF_F16 iterator for floating-point modes. gcc/testsuite/ChangeLog: * gcc.target/aarch6

[PATCH/GCC16 v2 0/1] AArch64: Emit half-precision FCMP/FCMPE

2025-01-31 Thread Spencer Abson
Applied the fixups suggested in the previous review, cheers. This patch allows the AArch64 back end to emit the half-precision variants of FCMP and FCMPE, given the target supports FEAT_FP16. Previously, such comparisons would be unnecessarily promoted to single-precision. The latest documentat

Re: [PATCH] icf: Compare call argument types in certain cases and asm operands [PR117432]

2025-01-31 Thread Jakub Jelinek
On Fri, Jan 31, 2025 at 01:38:36PM +0100, Richard Biener wrote: > > @@ -718,8 +720,11 @@ func_checker::compare_gimple_call (gcall > > > >/* For direct calls we verify that types are compatible so if we matched > > callees, callers must match, too. For indirect calls however verify > >

Re: [PATCH] OpenMP/Fortran: Add missing pop_state in parse_omp_dispatch

2025-01-31 Thread Paul-Antoine Arras
Pushed to master as obvious. This should fix PR118714. On 31/01/2025 11:46, Paul-Antoine Arras wrote: When the ST_NONE case is taken, the function returns immediately. Not calling pop_state causes a dangling pointer. gcc/fortran/ChangeLog: * parse.cc (parse_omp_dispatch): Add missing p

[Ada] Fix wrong elaboration for allocator at library level of dynamic library

2025-01-31 Thread Eric Botcazou
The problem was preexisting for class-wide allocators, but now occurs for allocators of controlled types on the mainline, because of the recent overhaul of the finalization machinery. Tested on x86-64/Linux, applied on the mainline. 2025-01-31 Eric Botcazou * gcc-interface/util

Re: [PATCH] icf: Compare call argument types in certain cases and asm operands [PR117432]

2025-01-31 Thread Richard Biener
On Fri, 31 Jan 2025, Jakub Jelinek wrote: > Hi! > > compare_operand uses operand_equal_p under the hood, which e.g. for > INTEGER_CSTs will just match the values regardless of their types. > Now, in many cases comparing the type is redundant, if we have > x_2 = y_3 + 1; > we've already compare

[PATCH][stage1] middle-end/80342 - genmatch optimize outer conversions

2025-01-31 Thread Richard Biener
The following improves genmatch generated code so we avoid more spurious SSA assignments to be pushed to the GIMPLE sequence or simplifications rejected when we're not supposed to produce any for outer and intermediate conversions. Bootstrapped and tested on x86_64-unknown-linux-gnu, queued for st

[committed] testsuite: Add testcase for already fixed PR [PR117498]

2025-01-31 Thread Jakub Jelinek
Hi! This wrong-code issue has been fixed with r15-7249. We still emit warnings which are questionable and perhaps we'd get better generated code if niters determined the loop has only a single iteration without UB and we'd punt on vectorizing it (or unrolling). Tested on x86_64-linux -m32/-m64, c

Re: [PATCH] libstdc++: Use canonical loop form in std::reduce

2025-01-31 Thread Jonathan Wakely
On Fri, 31 Jan 2025 at 12:48, Richard Biener wrote: > > On Fri, Jan 31, 2025 at 12:01 PM Abhishek Kaushik > wrote: > > > > From 4ac7c7e56e23ed2f4dd2dafdfab6cfa110c14260 Mon Sep 17 00:00:00 2001 > > From: Abhishek Kaushik > > Date: Fri, 31 Jan 2025 01:28:48 -0800 > > Subject: [PATCH] libstdc++: U

[PATCH 3/3] c++/modules: Handle exposures of TU-local types in uninstantiated member templates

2025-01-31 Thread Nathaniel Shead
Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk? Happy to remove the custom inform for lambdas, but I felt that the original message (which suggests that defining it within a class should make it OK) was unhelpful here. Similarly the 'is_exposure_of_member_type' function is not ne

Re: [PATCH v2] x86: Handle -mindirect-branch-register for -fno-plt

2025-01-31 Thread Uros Bizjak
On Fri, Jan 31, 2025 at 2:36 PM H.J. Lu wrote: > > -fno-plt forces external call to indirect call via GOT memory. But > -mindirect-branch-register requires indirect call and jump via register. > For -mindirect-branch-register, expand indirect call via register and > update call patterns and pe

[PATCH 2/2] libstdc++: Fix flat_foo::insert_range for non-common ranges [PR118156]

2025-01-31 Thread Patrick Palka
This fixes flat_map/multimap::insert_range by simply generalizing the ::insert implementation to handle heterogeneous iterator/sentinel pairs. I'm not sure we can do better than this, e.g. we can't implement it in terms of the adapted containers' insert_range because that'd require two passes over th

[PATCH 48/61] Performance degradation for iDCT-4M example

2025-01-31 Thread Aleksandar Rakic
From: "dragan.mladjenovic" This workaround adds mfuse-vect-init option which causes the back-end to emit a single load for the vect_init if all the init elements come from the consecutive memory locations and are in the right order. gcc/ * config/mips/mips.cc (mips_fuse_vect_init_p): New

[PATCH 42/61] Remove redundant moves

2025-01-31 Thread Aleksandar Rakic
From: Robert Suchanek Add peepholes to remove silly moves. These reloads happen because of different modes making elimination non-trivial. Cherry-picked 85462a9dbf8d659bfb0417d354a0a4f9cd4b8e07 from https://github.com/MIPS/gcc Signed-off-by: Robert Suchanek Signed-off-by: Faraz Shahbazker Si

[PATCH 39/61] Frame barrier fix

2025-01-31 Thread Aleksandar Rakic
From: Matthew Fortune Ensure the frame barrier prevents reordering of stack pointer changes. It is possible for a load/store accessing the stack via a copy of the stack pointer to be moved across the epilogue meaning that it accesses stack that is no longer allocated. This leads to a situation w

Re: [PATCH 1/2] libstdc++: Fix return value of vector::insert_range

2025-01-31 Thread Patrick Palka
On Fri, 31 Jan 2025, Patrick Palka wrote: > In some cases we're wrongly returning an iterator pointing to (one past) > the last element inserted instead of to the first element inserted. > > libstdc++-v3/ChangeLog: > > * include/bits/stl_bvector.h (vector::insert_range): > Consistent

Re: [PATCH] Fortran: host association issue with symbol in COMMON block [PR108454]

2025-01-31 Thread Jerry D
On 1/30/25 1:44 PM, Harald Anlauf wrote: Dear all, analyzing the PR (by Gerhard) turned up two slightly related issues. The first one, where a variable in a COMMON block is falsely resolved to a derived type declared in the host, leads to a false freeing of the symbol, resulting in memo

[PATCH 52/61] Fix register spill issue for soft-float glibc 2.29

2025-01-31 Thread Aleksandar Rakic
From: "dragan.mladjenovic" Adding the float-agnostic reproducer as test-case. gcc/testsuite/ * gcc.target/mips/tls-1.c: New file. Cherry-picked fa3b6a1347154973324d264e6ad2dbd66d3f0028 from https://github.com/MIPS/gcc Signed-off-by: Dragan Mladjenovic Signed-off-by: Faraz Shahb

[PATCH 60/61] Check anti-dependence between 0 and 3 for loads

2025-01-31 Thread Aleksandar Rakic
From: Chao-ying Fu gcc/ * config/mips/mips.md (join2_load_store): Check operand 0 and 3. Assert other two operands do not overlap after they are reordered. (*join2_loadhi): Same. Cherry-picked 63175687761e51dfe2f75dfab7b4de7f44bb4abe from https://github.com/MIPS/g

[PATCH 53/61] Inefficient scattered double precision load in MSA

2025-01-31 Thread Aleksandar Rakic
From: Mihailo Stojanovic gcc/ * config/mips/mips.cc (mips_legitimate_combined_insn): New function. Cherry-picked 092a39db956a418e7e020107b062c170ed976841 from https://github.com/MIPS/gcc Signed-off-by: Mihailo Stojanovic Signed-off-by: Faraz Shahbazker Signed-off-by: Aleksanda

[PATCH 54/61] fmadd.w should be restricted to mipsr6

2025-01-31 Thread Aleksandar Rakic
From: "dragan.mladjenovic" This patch prevents middle-end from using MSA fma on pre-r6 targets in order to avoid subtle inconsistencies with auto-vectorized code that might mix MSA fma with unfused scalar multiply-add. There might be Loongson targets that support MSA while having scalar multiply

[PATCH 45/61] Test float32-basic.c fails with -mabi=64 -EB

2025-01-31 Thread Aleksandar Rakic
From: "dragan.mladjenovic" Unlike float, the _Float32 value is passed w/o promotion when used as varargs parameter. On N32/64, the callee side expects it to be at offset 0 inside of 8-byte slot, which matches float behavior when passed on stack as named argument. Because of this, we need to make

[PATCH 25/61] Fix negative offset memory addressing

2025-01-31 Thread Aleksandar Rakic
From: Matthew Fortune Unconditionally set DONT_BREAK_DEPENDENCIES in scheduling flags. The code to break dependencies does not appear to provide a win under any circumstance and is often harmful. Disable it completely pending further investigation. gcc/ * config/mips/mips.cc (mips

[PATCH 11/61] Fix unsafe comparison against stack_pointer_rtx

2025-01-31 Thread Aleksandar Rakic
From: Andrew Bennett GCC can modify an rtx which was created using stack_pointer_rtx. This means that just doing a straight address comparison of an rtx against stack_pointer_rtx to see whether it is the stack pointer register will not be correct in all cases. This patch rewrites these comparison

[PATCH 20/61] Add -march=interaptiv-mr2 with MIPS16E2

2025-01-31 Thread Aleksandar Rakic
From: Robert Suchanek - Bugfix [MIPS16E2]: split of moves of negative constants should exclude zero const. - Add support for every style of ZEB/ZEH support that has been tried: An earlier attempt to improve generation of ZEB/ZEH led to a chaotic effect of sometimes generating the instructions a

[PATCH 34/61] Testsuite: Adjust tests to cope with -mips16

2025-01-31 Thread Aleksandar Rakic
From: Matthew Fortune Cherry-picked 38288a0fd125d70a7876763d7165f858d902 from https://github.com/MIPS/gcc Signed-off-by: Matthew Fortune Signed-off-by: Faraz Shahbazker Signed-off-by: Aleksandar Rakic --- .../gcc.target/mips/call-clobbered-2.c| 3 +- .../gcc.target/mips/call-cl

[PATCH 40/61] MIPSR6: Fix ICE occurred in R6 target

2025-01-31 Thread Aleksandar Rakic
From: Jaydeep Patil Fix ICE occurred in R6 target due to a clobber-list introduced in MADD/MSUB during combine pass. Cherry-picked 180f74c8ebdf13ddac806695d0333af7b924c402 from https://github.com/MIPS/gcc Signed-off-by: Jaydeep Patil Signed-off-by: Faraz Shahbazker Signed-off-by: Aleksandar R

[PATCH 24/61] P5600: Option -msched-weight added

2025-01-31 Thread Aleksandar Rakic
From: Jaydeep Patil Cherry-picked 0cf2542b41d8102800af180f0b6da1fe55a9d76b from https://github.com/MIPS/gcc Signed-off-by: Prachi Godbole Signed-off-by: Jaydeep Patil Signed-off-by: Faraz Shahbazker Signed-off-by: Aleksandar Rakic --- gcc/config/mips/mips.cc | 242 +

[PATCH 30/61] MSA: Make MSA and microMIPS R5 unsupported

2025-01-31 Thread Aleksandar Rakic
From: Matthew Fortune There are no platforms nor simulators for MSA and microMIPS R5 so turning off this support for now. gcc/ChangeLog: * config/mips/mips.cc (mips_option_override): Error out for -mmicromips -mmsa. Cherry-picked 1009d6ff7a8d3b56e0224a6b193c5a7b3c29aa5f from ht

[PATCH 51/61] Test solution on dspmac builtins

2025-01-31 Thread Aleksandar Rakic
From: Mihailo Stojanovic gcc/ * config/mips/mips.cc (mips_expand_builtin_insn): During expansion of DSP mac builtins, force the operands which correspond to the same inout register to have the same pseudo assigned. gcc/testsuite * gcc.target/mips/mac_zero_reload.c: New testcase

[PATCH 44/61] Autovectorization failures on BE targets

2025-01-31 Thread Aleksandar Rakic
From: "dragan.mladjenovic" GCC assumes that taking a vector mode B SUBREG of vector mode A register allows it to interpret its memory layout as if in A vector mode. We currently allow this mode change to be no-op on MSA registers. This works on little-endian because MSA register layout matches t
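The excerpt is truncated, so the following is only a generic illustration of why reinterpreting wide lanes as narrow lanes through memory depends on byte order. It does not model MSA registers; the helper names and the test value are assumptions.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

enum class ByteOrder { Little, Big };

// Store one 32-bit lane to memory in the given byte order.
std::array<std::uint8_t, 4> store_u32(std::uint32_t v, ByteOrder order) {
    std::array<std::uint8_t, 4> bytes{};
    for (int i = 0; i < 4; ++i) {
        const int shift = (order == ByteOrder::Little) ? 8 * i : 8 * (3 - i);
        bytes[i] = static_cast<std::uint8_t>(v >> shift);
    }
    return bytes;
}

// Load the first 16-bit lane from the same memory in the same byte order.
std::uint16_t load_first_u16(const std::array<std::uint8_t, 4>& bytes,
                             ByteOrder order) {
    const unsigned b0 = bytes[0];
    const unsigned b1 = bytes[1];
    return static_cast<std::uint16_t>(
        order == ByteOrder::Little ? (b0 | (b1 << 8)) : ((b0 << 8) | b1));
}

// Little-endian memory exposes the low half of the lane first,
// big-endian memory exposes the high half first.
void demo_lane_view() {
    const std::uint32_t lane = 0x11223344u;
    assert(load_first_u16(store_u32(lane, ByteOrder::Little),
                          ByteOrder::Little) == 0x3344);
    assert(load_first_u16(store_u32(lane, ByteOrder::Big),
                          ByteOrder::Big) == 0x1122);
}
```

A mode change that is a no-op on the register bits therefore agrees with the memory view only for one of the two byte orders.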

[PATCH 35/61] Testsuite: Use HAS_LDC instead of a specific ISA

2025-01-31 Thread Aleksandar Rakic
From: Matthew Fortune The call-clobbered-1.c test has both reasons to be above a certain ISA and below a certain ISA level. The option based ISA min/max code only triggers if there is no isa level request. gcc/testsuite/ * gcc.target/mips/call-clobbered-1.c: Use HAS_LDC ghost op

[PATCH 50/61] Fix MSA SUBREG moves on big-endian targets

2025-01-31 Thread Aleksandar Rakic
From: Mihailo Stojanovic This fixes the MSA implementation on big-endian targets which is essentially broken for things like SUBREG handling and calling convention for vector types. It borrows heavily from [1] as Aarch64 has the same problem with SVE vectors. Conceptually, register bitconverts s

[PATCH 47/61] Add -mmxu and -mno-mxu driver pass through

2025-01-31 Thread Aleksandar Rakic
From: Matthew Fortune Cherry-picked 9acbf0b0efdfcc27e30b1db7a707dbe9cc6b64eb from https://github.com/MIPS/gcc Signed-off-by: Matthew Fortune Signed-off-by: Faraz Shahbazker Signed-off-by: Aleksandar Rakic --- gcc/config/mips/mips.h | 1 + 1 file changed, 1 insertion(+) diff --git a/gcc/conf

[PATCH 55/61] Performance drop in mips-img-linux-gnu-gcc 7.x

2025-01-31 Thread Aleksandar Rakic
From: Mihailo Stojanovic gcc/ * config/mips/mips.cc (mips_rtx_costs): Reduce branch cost of conditional branches. (mips_prune_insertions_deletions): Target hook which checks whether a basic block is possibly if-convertible. Adjusts the insertion and deletio

[PATCH 41/61] Lightweight fix for shrink-wrapping inhibition

2025-01-31 Thread Aleksandar Rakic
From: Matthew Fortune This should be solved using the various PIC related macros such as PIC_OFFSET_TABLE_REGNUM and pic_offset_table_rtx but changing these is too dangerous without investigation. The lightweight fix for shrink-wrapping being inhibited by -mgpopt just clears the global pointer f

Re: [PATCH v2] x86: Handle -mindirect-branch-register for -fno-plt

2025-01-31 Thread H.J. Lu
On Fri, Jan 31, 2025 at 10:09 PM Uros Bizjak wrote: > > On Fri, Jan 31, 2025 at 2:54 PM Uros Bizjak wrote: > > > > On Fri, Jan 31, 2025 at 2:36 PM H.J. Lu wrote: > > > > > > -fno-plt forces external call to indirect call via GOT memory. But > > > -mindirect-branch-register requires indirect cal

Re: [PATCH][stage1] middle-end/80342 - genmatch optimize outer conversions

2025-01-31 Thread Andrew Pinski
On Fri, Jan 31, 2025 at 4:44 AM Richard Biener wrote: > > The following improves genmatch generated code so we avoid more > spurious SSA assignments to be pushed to the GIMPLE sequence or > simplifications rejected when we're not supposed to produce any > for outer and intermediate conversions. A

Re: [PATCH] libstdc++: Use canonical loop form in std::reduce

2025-01-31 Thread Jonathan Wakely
On Fri, 31 Jan 2025 at 14:47, Marc Glisse wrote: > > On Fri, 31 Jan 2025, Abhishek Kaushik wrote: > > > The current while loop in std::reduce and related functions is hard to > > vectorize because the loop control variable is hard to detect in icx. > > > > `while ((__last - __first) >= 4)` > > > >
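To make the two loop shapes under discussion concrete, here is a minimal sketch. Only the `while ((__last - __first) >= 4)` condition comes from the thread; the function names, the use of `+` instead of a binary operation parameter, and the counted rewrite are assumptions and may not match the actual patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Shape quoted in the thread: the exit test is recomputed from the
// moving iterator on every pass.
template <typename It, typename T>
T sum_while_form(It first, It last, T init) {
    while ((last - first) >= 4) {
        T v1 = first[0] + first[1];
        T v2 = first[2] + first[3];
        init = init + (v1 + v2);
        first += 4;
    }
    for (; first != last; ++first)
        init = init + *first;
    return init;
}

// Counted shape: the number of 4-element blocks is computed once and an
// explicit induction variable controls the loop.
template <typename It, typename T>
T sum_counted_form(It first, It last, T init) {
    const std::ptrdiff_t blocks = (last - first) / 4;
    for (std::ptrdiff_t i = 0; i < blocks; ++i) {
        T v1 = first[0] + first[1];
        T v2 = first[2] + first[3];
        init = init + (v1 + v2);
        first += 4;
    }
    for (; first != last; ++first)
        init = init + *first;
    return init;
}

// Both shapes visit the same elements in the same order.
void demo_loop_forms() {
    const std::vector<int> v{1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    assert(sum_while_form(v.begin(), v.end(), 0) == 55);
    assert(sum_counted_form(v.begin(), v.end(), 0) == 55);
}
```

The two functions compute the same result; the difference is only in how easily a compiler can identify the trip count.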

Re: [PATCH v2] wwwdocs: add a Python postprocessing script

2025-01-31 Thread Gerald Pfeifer
On Wed, 29 Jan 2025, David Malcolm wrote: >> python3: can't open file '/www/gcc/htdocs- >> preformatted/bin/process_html.py': [Errno 2] No such file or directory >> bin/process_html.py failed; aborting. >> >> I tried replacing this with just "process_html.py" or >> "./process_html.py", >> alas

Re: [PATCH] libstdc++: Use canonical loop form in std::reduce

2025-01-31 Thread Abhishek Kaushik
* ICX needs to be improved here Yes, we're trying to fix this but I figure I could also try asking politely. * a user could write such code himself. But it still makes sense for std::reduce to be faster than a hand-written reduce because we assume that as users of stl :) __
