[PATCH] [x86_64]: Zhaoxin yongfeng enablement

2023-10-24 Thread mayshao
Hi all: This patch enables -march/-mtune=yongfeng; costs and tunings are set according to the characteristics of the processor. We add a new md file to describe the yongfeng processor. Bootstrapped/regtested on x86_64. OK for trunk? BR Mayshao gcc/ChangeLog: * common/config/i386/

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-24 Thread Richard Biener
> On 24.10.2023 at 22:38, Martin Uecker wrote: > > On Tuesday, 24.10.2023 at 20:30 +, Qing Zhao wrote: >> Hi, Sid, >> >> Really appreciate for your example and detailed explanation. Very helpful. >> I think that this example is an excellent example to show (almost) all the >> iss

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-24 Thread Martin Uecker
On Tuesday, 24.10.2023 at 22:51 +, Qing Zhao wrote: > > > On Oct 24, 2023, at 4:38 PM, Martin Uecker wrote: > > > > On Tuesday, 24.10.2023 at 20:30 +, Qing Zhao wrote: > > > Hi, Sid, > > > > > > Really appreciate for your example and detailed explanation. Very helpful. > > >

[RFC] RISC-V: elide sign extend when expanding cmp_and_jump

2023-10-24 Thread Vineet Gupta
RV64 compare and branch instructions only support 64-bit operands. The backend unconditionally generates zero/sign extend at Expand time for compare and branch operands even if they are already as such e.g. function args which ABI guarantees to be sign-extended (at callsite). And subsequently REE

[PATCH] Improve tree_expr_nonnegative_p by using the ranger [PR111959]

2023-10-24 Thread Andrew Pinski
I noticed we were missing optimizing `a / (1 << b)` when we know that a is nonnegative but only due to ranger information. This adds the use of the global ranger to tree_single_nonnegative_warnv_p for SSA_NAME. I didn't extend tree_single_nonnegative_warnv_p to use the ranger for floating point nor
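A minimal C sketch of the arithmetic behind this transformation, not the patch itself: for a nonnegative `a`, dividing by `1 << b` gives the same result as shifting right by `b`, which is why knowing the sign from range information matters. The function names are hypothetical, and `b` is assumed to lie in 0..30 for a 32-bit `int`.

```c
/* Division by a power of two, as it might appear in source code. */
int div_pow2(int a, int b)
{
    return a / (1 << b);
}

/* Shift form. It matches div_pow2 only when a is nonnegative:
   division truncates toward zero, so negative inputs can differ. */
int shift_pow2(int a, int b)
{
    return a >> b;
}
```

For example, `div_pow2(100, 3)` and `shift_pow2(100, 3)` agree, while `div_pow2(-7, 1)` truncates toward zero and so cannot be replaced by a plain shift.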

[PATCH] match: Simplify `a != C1 ? abs(a) : C2` when C2 == abs(C1) [PR111957]

2023-10-24 Thread Andrew Pinski
This adds a match pattern for `a != C1 ? abs(a) : C2` which gets simplified to `abs(a)`. if C1 was originally *_MIN then change it over to use absu instead of abs. Bootstrapped and tested on x86_64-linux-gnu with no regressions. PR tree-optimization/111957 gcc/ChangeLog: * match
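A small C illustration of the equivalence the pattern relies on, assuming the example constants C1 = -5 and C2 = 5 (so C2 == abs(C1)); the function names are hypothetical and this is not the match.pd pattern itself.

```c
#include <stdlib.h>

/* Original form: a != C1 ? abs(a) : C2, with C1 = -5 and C2 = 5. */
int select_abs(int a)
{
    return a != -5 ? abs(a) : 5;
}

/* Simplified form: when a == C1 the result is abs(C1) == C2 anyway,
   so the conditional adds nothing. */
int plain_abs(int a)
{
    return abs(a);
}
```

Both functions return the same value for every `int` input for which `abs` is defined; the `*_MIN` case mentioned above is exactly the input where plain `abs` is not.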

PING^3 [PATCH v2] rs6000: Don't use optimize_function_for_speed_p too early [PR108184]

2023-10-24 Thread Kewen.Lin
Hi, Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609993.html BR, Kewen >> on 2023/1/16 17:08, Kewen.Lin via Gcc-patches wrote: >>> Hi, >>> >>> As Honza pointed out in [1], the current uses of function >>> optimize_function_for_speed_p in rs6000_option_override_intern

PING^5 [PATCH 0/9] rs6000: Rework rs6000_emit_vector_compare

2023-10-24 Thread Kewen.Lin
Hi, Gentle ping this series: https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607146.html BR, Kewen on 2022/11/24 17:15, Kewen Lin wrote: > Hi, > > Following Segher's suggestion, this patch series is to rework > function rs6000_emit_vector_compare for vector float an

[PATCH v3] sched: Change no_real_insns_p to no_real_nondebug_insns_p [PR108273]

2023-10-24 Thread Kewen.Lin
Hi, This is almost a repost for v2 which was posted at[1] in March excepting for: 1) rebased from r14-4810 which is relatively up-to-date, some conflicts on "int to bool" return type change have been resolved; 2) adjust commit log a bit; 3) fix misspelled "articial" with "artificia

Re: [PATCH 3/3]rs6000: split complicate constant to constant pool

2023-10-24 Thread Kewen.Lin
Hi, on 2023/10/25 10:00, Jiufu Guo wrote: > Hi, > > Sometimes, a complicated constant is built via 3 (or more) > instructions. Generally speaking, it would not be > as fast as loading it from the constant pool (see a few > discussions in PR63281). I may miss some previous discussions, b

Re: [PATCH 2/3]rs6000: using 'pli' to load 34bit-constant

2023-10-24 Thread Kewen.Lin
on 2023/10/25 10:00, Jiufu Guo wrote: > Hi, > > For constants with 16bit values, 'li or lis' can be used to generate > the value. For 34bit constant, 'pli' is ok to generate the value. > > Bootstrap&regtest pass on ppc64{,le}. > Is this ok for trunk? > > BR, > Jeff (Jiufu Guo) > > gcc/ChangeLog:

Re: [PATCH 1/3]rs6000: update num_insns_constant for 2 insns

2023-10-24 Thread Kewen.Lin
Hi, on 2023/10/25 10:00, Jiufu Guo wrote: > Hi, > > Trunk gcc supports more constants to be built via two instructions: e.g. > "li/lis; xori/xoris/rldicl/rldicr/rldic". > And then num_insns_constant should also be updated. > Thanks for updating this. > Bootstrap & regtest pass ppc64{,le}. > Is

[PATCH 3/3]rs6000: split complicate constant to constant pool

2023-10-24 Thread Jiufu Guo
Hi, Sometimes, a complicated constant is built via 3 (or more) instructions. Generally speaking, it would not be as fast as loading it from the constant pool (see a few discussions in PR63281). For the concern that I raised in: https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599676

[PATCH 2/3]rs6000: using 'pli' to load 34bit-constant

2023-10-24 Thread Jiufu Guo
Hi, For constants with 16bit values, 'li or lis' can be used to generate the value. For 34bit constant, 'pli' is ok to generate the value. Bootstrap&regtest pass on ppc64{,le}. Is this ok for trunk? BR, Jeff (Jiufu Guo) gcc/ChangeLog: * config/rs6000/rs6000.cc (rs6000_emit_set_long_const
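A rough C sketch of the range check implied here, assuming signed 16-bit and signed 34-bit immediates; `fits_signed` and `immediate_class` are hypothetical helpers for illustration, not functions from rs6000.cc, and the shifted 'lis' case is deliberately not modeled.

```c
#include <stdint.h>

/* Return 1 if value fits in a signed immediate of the given width
   (bits is assumed to be in the range 1..63). */
int fits_signed(int64_t value, int bits)
{
    int64_t min = -((int64_t)1 << (bits - 1));
    int64_t max = ((int64_t)1 << (bits - 1)) - 1;
    return value >= min && value <= max;
}

/* Hypothetical classification: 1 for a 16-bit 'li'-style immediate,
   2 for a 34-bit 'pli'-style immediate, 0 if neither is enough. */
int immediate_class(int64_t value)
{
    if (fits_signed(value, 16))
        return 1;
    if (fits_signed(value, 34))
        return 2;
    return 0;
}
```

Under these assumptions, 0x7fff is a 16-bit case, 0x8000 needs the 34-bit form, and 2^33 is out of range for both.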

[PATCH 1/3]rs6000: update num_insns_constant for 2 insns

2023-10-24 Thread Jiufu Guo
Hi, Trunk gcc supports more constants to be built via two instructions: e.g. "li/lis; xori/xoris/rldicl/rldicr/rldic". And then num_insns_constant should also be updated. Bootstrap & regtest pass ppc64{,le}. Is this ok for trunk? BR, Jeff (Jiufu Guo) gcc/ChangeLog: * config/rs6000/rs60

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-24 Thread Siddhesh Poyarekar
On 2023-10-24 18:51, Qing Zhao wrote: Thanks for the proposal! So what you suggested is: For every x.buf, change it as a __builtin_with_size(x.buf, x.L) in the FE, then the call to the _bdos (x.buf, 1) will Become: _bdos(__builtin_with_size(x.buf, x.L), 1)? Then the implicit use of x.L

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-24 Thread Siddhesh Poyarekar
On 2023-10-24 18:41, Qing Zhao wrote: On Oct 24, 2023, at 5:03 PM, Siddhesh Poyarekar wrote: On 2023-10-24 16:30, Qing Zhao wrote: Situation 2: With O0, the routine “get_size_from” was NOT inlined into “foo”, therefore, the call to __bdos is Not in the same routine as the instantiation of

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-24 Thread Qing Zhao
> On Oct 24, 2023, at 4:38 PM, Martin Uecker wrote: > > On Tuesday, 24.10.2023 at 20:30 +, Qing Zhao wrote: >> Hi, Sid, >> >> Really appreciate for your example and detailed explanation. Very helpful. >> I think that this example is an excellent example to show (almost) all the >> i

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-24 Thread Qing Zhao
> On Oct 24, 2023, at 5:03 PM, Siddhesh Poyarekar wrote: > > On 2023-10-24 16:30, Qing Zhao wrote: >> Situation 2: With O0, the routine “get_size_from” was NOT inlined into >> “foo”, therefore, the call to __bdos is Not in the same routine as the >> instantiation of the object, As a result, t

Re: [PATCH v2 3/3] c++: note other candidates when diagnosing deletedness

2023-10-24 Thread Jason Merrill
On 10/23/23 19:51, Patrick Palka wrote: With the previous two patches in place, we can now extend our deletedness diagnostic to note the other considered candidates, e.g.: deleted16.C: In function 'int main()': deleted16.C:10:4: error: use of deleted function 'void f(int)' 10 | f(0

Re: [PATCH v2 2/3] c++: remember candidates that we ignored

2023-10-24 Thread Jason Merrill
On 10/23/23 19:51, Patrick Palka wrote: During overload resolution, we sometimes outright ignore a function from the overload set and leave no trace of it in the candidates list, for example when we find a perfect non-template candidate we discard all function templates, or when the callee is a t

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-10-24 Thread Richard Sandiford
Robin Dapp writes: > The attached patch introduces a VCOND_MASK_LEN, helps for the riscv cases > that were broken before and looks unchanged on x86, aarch64 and power > bootstrap and testsuites. > > I only went with the minimal number of new match.pd patterns and did not > try stripping the length

Re: PR111754

2023-10-24 Thread Richard Sandiford
Hi, Sorry the slow review. I clearly didn't think this through properly when doing the review of the original patch, so I wanted to spend some time working on the code to get a better understanding of the problem. Prathamesh Kulkarni writes: > Hi, > For the following test-case: > > typedef floa

Re: [PATCH v2 1/3] c++: sort candidates according to viability

2023-10-24 Thread Jason Merrill
On 10/23/23 19:51, Patrick Palka wrote: The second patch in this series is new and ensures that the candidates list isn't mysteriously missing some candidates when noting other candidates due to deletedness. -- >8 -- This patch: * changes splice_viable to move the non-viable candidates to t

Re: [PATCH] c++: build_new_1 and non-dep array size [PR111929]

2023-10-24 Thread Jason Merrill
On 10/24/23 13:03, Patrick Palka wrote: Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look like the right approach? -- >8 -- This PR is another instance of NON_DEPENDENT_EXPR having acted as an "analysis barrier" for middle-end routines, and now that it's gone we may end up passi

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-24 Thread Siddhesh Poyarekar
On 2023-10-24 16:38, Martin Uecker wrote: Here is another proposal: Add a new builtin function __builtin_with_size(x, size) that returns x but behaves similar to an allocation function in that BDOS can look at the size argument to discover the size. The FE inserts this function when the field i

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-24 Thread Siddhesh Poyarekar
On 2023-10-24 16:30, Qing Zhao wrote: Situation 2: With O0, the routine “get_size_from” was NOT inlined into “foo”, therefore, the call to __bdos is Not in the same routine as the instantiation of the object, As a result, the TYPE info and the attached counted_by info of the object can NOT be

Re: [PATCH] c++: error with bit-fields and scoped enums [PR111895]

2023-10-24 Thread Marek Polacek
On Tue, Oct 24, 2023 at 04:46:02PM -0400, Jason Merrill wrote: > On 10/24/23 12:18, Marek Polacek wrote: > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? > > > > -- >8 -- > > Here we issue a bogus error: invalid operands of types 'unsigned char:2' > > and 'int' to binary 'operator!

Re: [PATCH v9 4/4] ree: Improve ree pass for rs6000 target using defined ABI interfaces

2023-10-24 Thread Vineet Gupta
On 10/24/23 13:36, rep.dot@gmail.com wrote: As said, I don't see why the below was not cleaned up before the V1 submission. Iff it breaks when manually CSEing, I'm curious why? The function below looks identical in v12 of the patch. Why didn't you use common subexpressions? ba Using CSE her

Re: [PATCH] c++: error with bit-fields and scoped enums [PR111895]

2023-10-24 Thread Jason Merrill
On 10/24/23 12:18, Marek Polacek wrote: Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? -- >8 -- Here we issue a bogus error: invalid operands of types 'unsigned char:2' and 'int' to binary 'operator!=' when casting a bit-field of scoped enum type to bool. In build_static_cast_1, p

Re: [PATCH] Fortran/OpenMP: event handle in task detach cannot be a coarray [PR104131]

2023-10-24 Thread Harald Anlauf
Dear all, Tobias argued in the PR that the testcase should actually be valid. Therefore withdrawing the patch. Sorry for expecting this to be a low-hanging fruit... Harald On 10/24/23 22:23, rep.dot@gmail.com wrote: On 24 October 2023 21:25:01 CEST, Harald Anlauf wrote: Dear all, the a

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-24 Thread Martin Uecker
On Tuesday, 24.10.2023 at 20:30 +, Qing Zhao wrote: > Hi, Sid, > > Really appreciate for your example and detailed explanation. Very helpful. > I think that this example is an excellent example to show (almost) all the > issues we need to consider. > > I slightly modified this example

Re: [PATCH v9 4/4] ree: Improve ree pass for rs6000 target using defined ABI interfaces

2023-10-24 Thread rep . dot . nop
On 24 October 2023 09:36:22 CEST, Ajit Agarwal wrote: >Hello Bernhard: > >On 23/10/23 7:40 pm, Bernhard Reutner-Fischer wrote: >> On Mon, 23 Oct 2023 12:16:18 +0530 >> Ajit Agarwal wrote: >> >>> Hello All: >>> >>> Addressed below review comments in the version 11 of the patch. >>> Please review

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-24 Thread Qing Zhao
Hi, Sid, Really appreciate for your example and detailed explanation. Very helpful. I think that this example is an excellent example to show (almost) all the issues we need to consider. I slightly modified this example to make it to be compilable and run-able, as following: (but I still canno

Re: [PATCH] libstdc++ Add cstdarg to freestanding

2023-10-24 Thread Jonathan Wakely
On Sun, 22 Oct 2023 at 21:06, Arsen Arsenović wrote: > > "Paul M. Bendixen" writes: > > > Updated patch, added the requested files, hopefully wrote the commit > better. > > LGTM. Jonathan? > Yup, looks good. I've pushed it to trunk with a tweaked changelog entry. I'll backport it to gcc-13 soo

Re: [PATCH] Fortran/OpenMP: event handle in task detach cannot be a coarray [PR104131]

2023-10-24 Thread rep . dot . nop
On 24 October 2023 21:25:01 CEST, Harald Anlauf wrote: >Dear all, > >the attached simple patch adds a forgotten check that an event handle >cannot be a coarray. This case appears to have been overlooked in the >original fix for this PR. > >I intend to commit as obvious within 24h unless there are

Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-10-24 Thread Robin Dapp
Changed as suggested. The difference to v5 is thus: + if (cond_fn_p) + { + gcall *call = dyn_cast <gcall *> (use_stmt); + unsigned else_pos + = internal_fn_else_index (internal_fn (op.code)); + + for (unsigned int j = 0; j < gimple_call_nu

[PATCH] Fortran/OpenMP: event handle in task detach cannot be a coarray [PR104131]

2023-10-24 Thread Harald Anlauf
Dear all, the attached simple patch adds a forgotten check that an event handle cannot be a coarray. This case appears to have been overlooked in the original fix for this PR. I intend to commit as obvious within 24h unless there are comments. Thanks, Harald From 2b5ed32cacfe84dc4df74b4dccf16a

Re: [PATCH v3] gcc: Introduce -fhardened

2023-10-24 Thread Iain Sandoe
> On 24 Oct 2023, at 20:03, Marek Polacek wrote: > > On Tue, Oct 24, 2023 at 10:34:22AM +0100, Iain Sandoe wrote: >> hi Marek, >> >>> On 24 Oct 2023, at 08:44, Iain Sandoe wrote: >>> On 23 Oct 2023, at 20:25, Marek Polacek wrote: On Thu, Oct 19, 2023 at 02:24:11PM +0200, Richard

Re: [PATCH] gcov-io.h: fix comment regarding length of records

2023-10-24 Thread Jose E. Marchesi
> On 10/24/23 06:41, Jose E. Marchesi wrote: >> The length of gcov records is stored as a signed 32-bit number of >> bytes. >> Ok? > OK. Pushed. Thanks.

Re: [PATCH v3] gcc: Introduce -fhardened

2023-10-24 Thread Marek Polacek
On Tue, Oct 24, 2023 at 09:22:25AM +0200, Richard Biener wrote: > On Mon, Oct 23, 2023 at 9:26 PM Marek Polacek wrote: > > > > On Thu, Oct 19, 2023 at 02:24:11PM +0200, Richard Biener wrote: > > > On Wed, Oct 11, 2023 at 10:48 PM Marek Polacek wrote: > > > > > > > > On Tue, Sep 19, 2023 at 10:58:

Re: [PATCH v3] gcc: Introduce -fhardened

2023-10-24 Thread Marek Polacek
On Tue, Oct 24, 2023 at 10:34:22AM +0100, Iain Sandoe wrote: > hi Marek, > > > On 24 Oct 2023, at 08:44, Iain Sandoe wrote: > > On 23 Oct 2023, at 20:25, Marek Polacek wrote: > >> > >> On Thu, Oct 19, 2023 at 02:24:11PM +0200, Richard Biener wrote: > >>> On Wed, Oct 11, 2023 at 10:48 PM Marek P

[PATCH] Add a late-combine pass [PR106594]

2023-10-24 Thread Richard Sandiford
This patch adds a combine pass that runs late in the pipeline. There are two instances: one between combine and split1, and one after postreload. The pass currently has a single objective: remove definitions by substituting into all uses. The pre-RA version tries to restrict itself to cases that

Re: [PATCH 3/3] rtl-ssa: Add new helper functions

2023-10-24 Thread Jeff Law
On 10/24/23 11:58, Richard Sandiford wrote: This patch adds some RTL-SSA helper functions. They will be used by the upcoming late-combine pass. The patch contains the first non-template out-of-line function declared in movement.h, so it adds a movement.cc. I realise it seems a bit over-the-

Re: [PATCH 2/3] rtl-ssa: Extend make_uses_available

2023-10-24 Thread Jeff Law
On 10/24/23 11:58, Richard Sandiford wrote: The first in-tree use of RTL-SSA was fwprop, and one of the goals was to make the fwprop rewrite preserve the old behaviour as far as possible. The switch to RTL-SSA was supposed to be a pure infrastructure change. So RTL-SSA has various FIXMEs for

Re: [PATCH 1/3] rtl-ssa: Use frequency-weighted insn costs

2023-10-24 Thread Jeff Law
On 10/24/23 11:58, Richard Sandiford wrote: rtl_ssa::changes_are_worthwhile used the standard approach of summing up the individual costs of the old and new sequences to see which one is better overall. But when optimising for speed and changing instructions in multiple blocks, it seems bette

Re: [PATCH V14 4/4] ree: Improve ree pass using defined abi interfaces

2023-10-24 Thread Vineet Gupta
On 10/24/23 10:03, Ajit Agarwal wrote: Hello Vineet, Jeff and Bernhard: This version 14 of the patch uses abi interfaces to remove zero and sign extension elimination. This fixes aarch64 regressions failures with aggressive CSE. Once again, this information belongs between the two "---" lin

Re: [PATCH] testsuite: Fix _BitInt in gcc.misc-tests/godump-1.c

2023-10-24 Thread Jeff Law
On 10/24/23 09:26, Stefan Schulze Frielinghaus wrote: Currently _BitInt is only supported on x86_64 which means that for other targets all tests fail with e.g. gcc.misc-tests/godump-1.c:237:1: sorry, unimplemented: '_BitInt(32)' is not supported on this target 237 | _BitInt(32) b32_v;

Re: [PATCH] gcov-io.h: fix comment regarding length of records

2023-10-24 Thread Jeff Law
On 10/24/23 06:41, Jose E. Marchesi wrote: The length of gcov records is stored as a signed 32-bit number of bytes. Ok? OK. jeff

Re: [PATCH] recog/reload: Remove old UNARY_P operand support

2023-10-24 Thread Jeff Law
On 10/24/23 04:14, Richard Sandiford wrote: reload and constrain_operands had some old code to look through unary operators. E.g. an operand could be (sign_extend (reg X)), and the constraints would match the reg rather than the sign_extend. > This was previously used by the MIPS port. But r

[PATCH 3/3] rtl-ssa: Add new helper functions

2023-10-24 Thread Richard Sandiford
This patch adds some RTL-SSA helper functions. They will be used by the upcoming late-combine pass. The patch contains the first non-template out-of-line function declared in movement.h, so it adds a movement.cc. I realise it seems a bit over-the-top to have a file with just one function, but it

[PATCH 2/3] rtl-ssa: Extend make_uses_available

2023-10-24 Thread Richard Sandiford
The first in-tree use of RTL-SSA was fwprop, and one of the goals was to make the fwprop rewrite preserve the old behaviour as far as possible. The switch to RTL-SSA was supposed to be a pure infrastructure change. So RTL-SSA has various FIXMEs for things that were artificially limited to faciliat

[PATCH 1/3] rtl-ssa: Use frequency-weighted insn costs

2023-10-24 Thread Richard Sandiford
rtl_ssa::changes_are_worthwhile used the standard approach of summing up the individual costs of the old and new sequences to see which one is better overall. But when optimising for speed and changing instructions in multiple blocks, it seems better to weight the cost of each instruction by its e
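A hedged C sketch of the idea the subject line names, comparing a plain sum of instruction costs with a frequency-weighted sum. The `insn_sample` struct and both functions are hypothetical and are not the RTL-SSA interface; the frequencies are assumed to be relative block execution frequencies.

```c
#include <stddef.h>

/* Hypothetical record: an instruction's cost and the relative
   execution frequency of the block that contains it. */
struct insn_sample {
    int cost;
    double frequency;
};

/* Standard approach: add up the individual costs. */
double unweighted_cost(const struct insn_sample *insns, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += insns[i].cost;
    return total;
}

/* Weighted approach: scale each cost by its block's frequency, so an
   instruction in a rarely executed block contributes less. */
double weighted_cost(const struct insn_sample *insns, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += insns[i].cost * insns[i].frequency;
    return total;
}
```

With an old sequence of one cost-4 instruction at frequency 1.0 and a new sequence of a cost-2 instruction at frequency 1.0 plus a cost-3 instruction at frequency 0.1, the plain sum prefers the old sequence (4 against 5) while the weighted sum prefers the new one (4 against roughly 2.3).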

[PATCH 0/3] rtl-ssa: Various extensions for the late-combine pass

2023-10-24 Thread Richard Sandiford
This series adds some RTL-SSA enhancements that are needed by the late-combine pass. Tested on aarch64-linux-gnu & x86_64-linux-gnu. OK to install? Richard Richard Sandiford (3): rtl-ssa: Use frequency-weighted insn costs rtl-ssa: Extend make_uses_available rtl-ssa: Add new helper functio

Re: [PATCH 6/6] rtl-ssa: Handle call clobbers in more places

2023-10-24 Thread Jeff Law
On 10/24/23 04:50, Richard Sandiford wrote: In order to save (a lot of) memory, RTL-SSA avoids creating individual clobber records for every call-clobbered register. It instead maintains a list & splay tree of calls in an EBB, grouped by ABI. This patch takes these call clobbers into account

Re: [PATCH 5/6] rtl-ssa: Calculate dominance frontiers for the exit block

2023-10-24 Thread Jeff Law
On 10/24/23 04:50, Richard Sandiford wrote: The exit block can have multiple predecessors, for example if the function calls __builtin_eh_return. We might then need PHI nodes for values that are live on exit. RTL-SSA uses the normal dominance frontiers approach for calculating where PHI node

[PATCH v2] AArch64: Improve immediate generation

2023-10-24 Thread Wilco Dijkstra
v2: Use check-function-bodies in tests Further improve immediate generation by adding support for 2-instruction MOV/EOR bitmask immediates. This reduces the number of 3/4-instruction immediates in SPECCPU2017 by ~2%. Passes regress, OK for commit? gcc/ChangeLog: * config/aarch64/aarch64

Re: [PATCH 4/6] rtl-ssa: Handle artifical uses of deleted defs

2023-10-24 Thread Jeff Law
On 10/24/23 04:50, Richard Sandiford wrote: If an optimisation removes the last real use of a definition, there can still be artificial uses left. This patch removes those uses too. These artificial uses exist because RTL-SSA is only an SSA-like view of the existing RTL IL, rather than a nat

Re: [PATCH 3/6] rtl-ssa: Fix ICE when deleting memory clobbers

2023-10-24 Thread Jeff Law
On 10/24/23 04:50, Richard Sandiford wrote: Sometimes an optimisation can remove a clobber of scratch registers or scratch memory. We then need to update the DU chains to reflect the removed clobber. For registers this isn't a problem. Clobbers of registers are just momentary blips in the r

Re: [PATCH 2/6] rtl-ssa: Create REG_UNUSED notes after all pending changes

2023-10-24 Thread Jeff Law
On 10/24/23 04:50, Richard Sandiford wrote: Unlike REG_DEAD notes, REG_UNUSED notes need to be kept free of false positives by all passes. function_info::change_insns does this by removing all REG_UNUSED notes, and then using add_reg_unused_notes to add notes back (or create new ones) where a

Re: [PATCH 1/6] rtl-ssa: Ensure global registers are live on exit

2023-10-24 Thread Jeff Law
On 10/24/23 04:50, Richard Sandiford wrote: RTL-SSA mostly relies on DF for block-level register liveness information, including artificial uses and defs at the beginning and end of blocks. But one case was missing. DF does not add artificial uses of global registers to the beginning or end

[PATCH V14 4/4] ree: Improve ree pass using defined abi interfaces

2023-10-24 Thread Ajit Agarwal
Hello Vineet, Jeff and Bernhard: This version 14 of the patch uses abi interfaces to remove zero and sign extension elimination. This fixes aarch64 regressions failures with aggressive CSE. Bootstrapped and regtested on powerpc-linux-gnu. In this version (version 14) of the patch following revi

[PATCH] c++: build_new_1 and non-dep array size [PR111929]

2023-10-24 Thread Patrick Palka
Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look like the right approach? -- >8 -- This PR is another instance of NON_DEPENDENT_EXPR having acted as an "analysis barrier" for middle-end routines, and now that it's gone we may end up passing weird templated trees (that have a gene

Re: [PATCH] c++: cp_stabilize_reference and non-dep exprs [PR111919]

2023-10-24 Thread Jason Merrill
On 10/23/23 19:49, Patrick Palka wrote: Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk? -- >8 -- After the removal of NON_DEPENDENT_EXPR, cp_stabilize_reference which used to just exit early for NON_DEPENDENT_EXPR is now more prone to passing a weird templated tr

[PATCH] c++: error with bit-fields and scoped enums [PR111895]

2023-10-24 Thread Marek Polacek
Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? -- >8 -- Here we issue a bogus error: invalid operands of types 'unsigned char:2' and 'int' to binary 'operator!=' when casting a bit-field of scoped enum type to bool. In build_static_cast_1, perform_direct_initialization_if_possible r

Re: [PATCH] Fortran: Fix incompatible types between INTEGER(8) and TYPE(c_ptr)

2023-10-24 Thread Tobias Burnus
Hi PA, hello all, First, I hesitate to review/approve a patch I am involved in; Thus, I would like if someone could have a second look. Regarding the patch itself: On 20.10.23 16:02, Paul-Antoine Arras wrote: Hi all, The attached patch fixes a bug that causes valid OpenMP declare variant dire

[PATCH] config, aarch64: Use a more compatible sed invocation.

2023-10-24 Thread Iain Sandoe
Although this came up initially when working on the Darwin Arm64 port, it also breaks cross-compilers on platforms with non-GNU sed. Tested on x86_64-darwin X aarch64-linux-gnu, aarch64-darwin, aarch64-linux-gnu and x86_64-linux-gnu. OK for master? thanks, Iain --- 8< --- Currently, the sed com

[PATCH] testsuite: Fix _BitInt in gcc.misc-tests/godump-1.c

2023-10-24 Thread Stefan Schulze Frielinghaus
Currently _BitInt is only supported on x86_64 which means that for other targets all tests fail with e.g. gcc.misc-tests/godump-1.c:237:1: sorry, unimplemented: '_BitInt(32)' is not supported on this target 237 | _BitInt(32) b32_v; | ^~~ Instead of requiring _BitInt support for godum

Re: [PATCH] RISC-V: Add AVL propagation PASS for RVV auto-vectorization

2023-10-24 Thread Kito Cheng
> +using namespace rtl_ssa; > +using namespace riscv_vector; > + > +/* The AVL propagation instructions and corresponding preferred AVL. > + It will be updated during the analysis. */ > +static hash_map *avlprops; Maybe put into member data of pass_avlprop? > + > +const pass_data pass_data_avl

Re: [PING][PATCH 2/2] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2023-10-24 Thread Richard Sandiford
Sorry for the slow review. I had a look at the arm bits too, to get some context for the target-independent bits. Stamatis Markianos-Wright via Gcc-patches writes: > [...] > diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h > index 77e76336e94..74186930f0b 100644 > --- a/gcc

Re: [PATCH v1 1/1] gcc: config: microblaze: fix cpu version check

2023-10-24 Thread Michael Eager
On 10/24/23 00:01, Frager, Neal wrote: There is a microblaze cpu version 10.0 included in versal. If the minor version is only a single digit, then the version comparison will fail as version 10.0 will appear as 100 compared to version 6.00 or 8.30 which will calculate to values 600 and 830. The
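The failure mode described here can be reproduced with a short C sketch. This is an illustration only, not the microblaze backend code: `naive_version_key` and `compare_versions` are hypothetical names, and version strings are assumed to have the form "MAJOR.MINOR".

```c
#include <ctype.h>
#include <stdlib.h>

/* Naive key: concatenate the digits, so "10.0" becomes 100 while
   "6.00" becomes 600 and "8.30" becomes 830. */
int naive_version_key(const char *version)
{
    int key = 0;
    for (const char *p = version; *p != '\0'; p++)
        if (isdigit((unsigned char)*p))
            key = key * 10 + (*p - '0');
    return key;
}

/* One possible alternative: compare the major and minor parts
   separately. Returns a negative, zero, or positive value. */
int compare_versions(const char *a, const char *b)
{
    char *end_a;
    char *end_b;
    long major_a = strtol(a, &end_a, 10);
    long major_b = strtol(b, &end_b, 10);
    if (major_a != major_b)
        return major_a < major_b ? -1 : 1;

    long minor_a = (*end_a == '.') ? strtol(end_a + 1, NULL, 10) : 0;
    long minor_b = (*end_b == '.') ? strtol(end_b + 1, NULL, 10) : 0;
    if (minor_a != minor_b)
        return minor_a < minor_b ? -1 : 1;
    return 0;
}
```

With the naive key, 10.0 sorts below both 6.00 and 8.30; comparing the parts separately orders it above them.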

[RFC PATCH] Detecting lifetime-dse issues via Valgrind

2023-10-24 Thread exactlywb
From: Daniil Frolov PR 66487 is asking to provide sanitizer-like detection for C++ object lifetime violations that are worked around with -fno-lifetime-dse in Firefox, LLVM, OpenJade. The discussion in the PR was centered around extending MSan, but MSan was not ported to GCC (and requires rebuil

Re: [PATCH] testsuite: Fix gcc.target/arm/mve/mve_vadcq_vsbcq_fpscr_overwrite.c

2023-10-24 Thread Richard Earnshaw
On 08/09/2023 09:43, Christophe Lyon via Gcc-patches wrote: The test was declaring 'int *carry;' and wrote to '*carry' without initializing 'carry' first, leading to an attempt to write at address zero, and a crash. Fix by declaring 'int carry;' and passing '&carry' instead of 'carry' as par
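The shape of this fix can be sketched in plain C. The `add_with_carry` function below is a hypothetical stand-in for an intrinsic that reports a carry through a pointer; it is not the MVE intrinsic used in the actual test.

```c
/* Hypothetical stand-in: returns a + b and stores the carry-out
   (0 or 1) through carry_out. */
unsigned add_with_carry(unsigned a, unsigned b, int *carry_out)
{
    unsigned sum = a + b;
    *carry_out = sum < a;   /* unsigned wraparound means a carry occurred */
    return sum;
}

unsigned sum_and_get_carry(unsigned a, unsigned b, int *carry_result)
{
    /* Broken variant: 'int *carry;' followed by
       add_with_carry(a, b, carry) writes through a pointer that
       points at no valid object. */
    int carry;                              /* real storage for the flag */
    unsigned sum = add_with_carry(a, b, &carry);
    *carry_result = carry;
    return sum;
}
```

Declaring the object itself and passing its address gives the callee valid storage to write to, which is the change the message describes.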

Re: Re: [PATCH v2] RISC-V: Fix ICE of RVV vget/vset intrinsic[PR111935]

2023-10-24 Thread Kito Cheng
Ok for gcc 13 but just wait one more week to make sure everything is fine as gcc convention :) Li Xu wrote on Tuesday, 24 October 2023 at 15:49: > Committed to trunk. Thanks juzhe. > > > -- > > > > Li Xu > > > > >Ok for trunk (You can commit it to the trunk now). > > > > > > > > > >For GCC-13, I'd lik

Re: Re: [PATCH V5] VECT: Enhance SLP of MASK_LEN_GATHER_LOAD[PR111721]

2023-10-24 Thread juzhe.zh...@rivai.ai
Hi, Richard. Assertion failed at this IR: _427 = _425 & _426; _429 = present$0_16(D) != 0; _430 = _425 & _429; _409 = _430 | _445; _410 = _409 | _449; _411 = .LOOP_VECTORIZED (3, 6); if (_411 != 0) goto ; [100.00%] else goto ; [100.00%] [local count: 3280550]: [lo

[committed] arc: Remove mpy_dest_reg_operand predicate

2023-10-24 Thread Claudiu Zissulescu
The mpy_dest_reg_operand is just a wrapper for register_operand. Remove it. gcc/ * config/arc/arc.md (mulsi3_700): Update pattern. (mulsi3_v2): Likewise. * config/arc/predicates.md (mpy_dest_reg_operand): Remove it. Signed-off-by: Claudiu Zissulescu --- gcc/config/arc/a

[PATCH] gcov-io.h: fix comment regarding length of records

2023-10-24 Thread Jose E. Marchesi
The length of gcov records is stored as a signed 32-bit number of bytes. Ok? diff --git a/gcc/gcov-io.h b/gcc/gcov-io.h index bfe4439d02d..e6f33e32652 100644 --- a/gcc/gcov-io.h +++ b/gcc/gcov-io.h @@ -101,7 +101,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see Rec

Re: [x86 PATCH] Fine tune STV register conversion costs for -Os.

2023-10-24 Thread Uros Bizjak
On Mon, Oct 23, 2023 at 4:47 PM Roger Sayle wrote: > > > The eagle-eyed may have spotted that my recent testcases for DImode shifts > on x86_64 included -mno-stv in the dg-options. This is because the > Scalar-To-Vector (STV) pass currently transforms these shifts to use > SSE vector operations,

[PATCH GCC13 backport] Avoid compile time hog on vect_peel_nonlinear_iv_init for nonlinear induction vec_step_op_mul when iteration count is too big.

2023-10-24 Thread liuhongt
This is the backport patch for releases/gcc-13 branch, the original patch for main trunk is at [1]. The only difference between this backport patch and [1] is GCC13 doesn't support auto_mpz, So this patch manually use mpz_init/mpz_clear. [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-October

Re: [PATCH] i386: Fix undefined masks in vpopcnt tests

2023-10-24 Thread Hongtao Liu
On Tue, Oct 24, 2023 at 6:10 PM Richard Sandiford wrote: > > The files changed in this patch had tests for masked and unmasked > popcnt. However, the mask inputs to the masked forms were undefined, > and would be set to zero by init_regs. Any combine-like pass that > ran after init_regs could th

Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-10-24 Thread Richard Sandiford
Richard Biener writes: > On Thu, 19 Oct 2023, Robin Dapp wrote: > >> Ugh, I didn't push yet because with a rebased trunk I am >> seeing different behavior for some riscv testcases. >> >> A reduction is not recognized because there is yet another >> "double use" occurrence in check_reduction_path.

Re: [PATCH v25 25/33] libstdc++: Optimize std::is_function compilation performance

2023-10-24 Thread Jonathan Wakely
On Tue, 24 Oct 2023 at 03:16, Ken Matsui wrote: > This patch optimizes the compilation performance of std::is_function > by dispatching to the new __is_function built-in trait. > > libstdc++-v3/ChangeLog: > > * include/std/type_traits (is_function): Use __is_function > built-in tr

[PATCH 4/6] rtl-ssa: Handle artificial uses of deleted defs

2023-10-24 Thread Richard Sandiford
If an optimisation removes the last real use of a definition, there can still be artificial uses left. This patch removes those uses too. These artificial uses exist because RTL-SSA is only an SSA-like view of the existing RTL IL, rather than a native SSA representation. It effectively treats RTL

[PATCH 2/6] rtl-ssa: Create REG_UNUSED notes after all pending changes

2023-10-24 Thread Richard Sandiford
Unlike REG_DEAD notes, REG_UNUSED notes need to be kept free of false positives by all passes. function_info::change_insns does this by removing all REG_UNUSED notes, and then using add_reg_unused_notes to add notes back (or create new ones) where appropriate. The problem was that it called add_r

[PATCH 6/6] rtl-ssa: Handle call clobbers in more places

2023-10-24 Thread Richard Sandiford
In order to save (a lot of) memory, RTL-SSA avoids creating individual clobber records for every call-clobbered register. It instead maintains a list & splay tree of calls in an EBB, grouped by ABI. This patch takes these call clobbers into account in a couple more routines. I don't think this wi

[PATCH 3/6] rtl-ssa: Fix ICE when deleting memory clobbers

2023-10-24 Thread Richard Sandiford
Sometimes an optimisation can remove a clobber of scratch registers or scratch memory. We then need to update the DU chains to reflect the removed clobber. For registers this isn't a problem. Clobbers of registers are just momentary blips in the register's lifetime. They act as a barrier for mo

[PATCH 1/6] rtl-ssa: Ensure global registers are live on exit

2023-10-24 Thread Richard Sandiford
RTL-SSA mostly relies on DF for block-level register liveness information, including artificial uses and defs at the beginning and end of blocks. But one case was missing. DF does not add artificial uses of global registers to the beginning or end of a block. Instead it marks them as used within

[PATCH 0/6] rtl-ssa: Various fixes needed for the late-combine pass

2023-10-24 Thread Richard Sandiford
Testing the late-combine pass showed a depressing number of bugs in areas of RTL-SSA that hadn't been used much until now. Most of them relate to doing things after RA. Tested on aarch64-linux-gnu & x86_64-linux-gnu. OK to install? Richard Richard Sandiford (6): rtl-ssa: Ensure global registe

Re: [PATCH] testsuite: Fix gcc.target/arm/mve/mve_vadcq_vsbcq_fpscr_overwrite.c

2023-10-24 Thread Christophe Lyon
Ping? On Mon, 2 Oct 2023, 10:23, Christophe Lyon wrote: > ping? maybe this counts as obvious? > > > On Thu, 14 Sept 2023 at 11:13, Christophe Lyon > wrote: > >> ping? >> >> On Fri, 8 Sept 2023 at 10:43, Christophe Lyon >> wrote: >> >>> The test was declaring 'int *carry;' and wrote to '*ca

Re: [PATCH 1/2] testsuite: Add and use thread_fence effective-target

2023-10-24 Thread Christophe Lyon
Ping? On Mon, 2 Oct 2023, 10:24, Christophe Lyon wrote: > ping? > > On Sun, 10 Sept 2023 at 21:31, Christophe Lyon > wrote: > >> Some targets like arm-eabi with newlib and default settings rely on >> __sync_synchronize() to ensure synchronization. Newlib does not >> implement it by default

[PATCH 2/4] rtl-ssa: Fix handling of deleted insns

2023-10-24 Thread Richard Sandiford
RTL-SSA queues up some invasive changes for later. But sometimes the insns involved in those changes can be deleted by later optimisations, making the queued change unnecessary. This patch checks for that case. gcc/ * rtl-ssa/changes.cc (function_info::perform_pending_updates): Check

[PATCH 3/4] rtl-ssa: Don't insert after insns that can throw

2023-10-24 Thread Richard Sandiford
rtl_ssa::can_insert_after didn't handle insns that can throw. Fixing that avoids a regression with a later patch. gcc/ * rtl-ssa.h: Include cfgbuild.h. * rtl-ssa/movement.h (can_insert_after): Replace is_jump with the more comprehensive control_flow_insn_p. --- gcc/rtl-ssa

[PATCH 4/4] rtl-ssa: Avoid creating duplicated phis

2023-10-24 Thread Richard Sandiford
If make_uses_available was called twice for the same use, we could end up trying to create duplicate definitions for the same extended live range. gcc/ * rtl-ssa/blocks.cc (function_info::create_degenerate_phi): Check whether the requested phi already exists. --- gcc/rtl-ssa/block

[PATCH 1/4] rtl-ssa: Fix null deref in first_any_insn_use

2023-10-24 Thread Richard Sandiford
first_any_insn_use implicitly (but contrary to its documentation) assumed that there was at least one use. gcc/ * rtl-ssa/member-fns.inl (first_any_insn_use): Handle null m_first_use. --- gcc/rtl-ssa/member-fns.inl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git
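The class of bug described here can be sketched in plain C. Everything below (the struct layouts, the field names, and the C function signature) is a hypothetical stand-in for illustration, not RTL-SSA's actual C++ code; only the idea of checking the first-use pointer for null before dereferencing it comes from the patch description.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a use record and a definition's use list.  */
struct use { int in_insn; struct use *next; };
struct def { struct use *first_use; /* NULL when there are no uses.  */ };

/* Return the first use if it belongs to an instruction, else NULL.
   The null check covers definitions that have no uses at all.  */
struct use *first_any_insn_use (const struct def *d)
{
  if (d->first_use != NULL && d->first_use->in_insn)
    return d->first_use;
  return NULL;
}
```

With the null check in place, a definition with an empty use list simply yields NULL instead of dereferencing a null pointer.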

[PATCH 0/4] rtl-ssa: Some small, obvious fixes

2023-10-24 Thread Richard Sandiford
This series contains some small fixes to RTL-SSA. Tested on aarch64-linux-gnu & x86_64-linux-gnu, pushed as obvious. Richard Sandiford (4): rtl-ssa: Fix null deref in first_any_insn_use rtl-ssa: Fix handling of deleted insns rtl-ssa: Don't insert after insns that can throw rtl-ssa: Avoid

[PING] [PATCH 1/3] [GCC] arm: vld1q_types_x2 ACLE intrinsics

2023-10-24 Thread Ezra Sitorus
Ping From: ezra.sito...@arm.com Sent: Friday, October 6, 2023 10:49 AM To: gcc-patches@gcc.gnu.org Cc: Richard Earnshaw; Kyrylo Tkachov Subject: [PATCH 1/3] [GCC] arm: vld1q_types_x2 ACLE intrinsics From: Ezra Sitorus This patch is part of a series of p

Re: [PATCH] i386: Avoid paradoxical subreg dests in vector zero_extend

2023-10-24 Thread Uros Bizjak
On Tue, Oct 24, 2023 at 12:08 PM Richard Sandiford wrote: > > For the V2HI -> V2SI zero extension in: > > typedef unsigned short v2hi __attribute__((vector_size(4))); > typedef unsigned int v2si __attribute__((vector_size(8))); > v2si f (v2hi x) { return (v2si) {x[0], x[1]}; } > > ix86_expan
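The quoted test case can be exercised as a small standalone C file. This is a minimal sketch assuming a compiler with GNU C vector extensions (GCC or Clang); `check_zero_extend` is a hypothetical helper added for illustration and is not part of the patch.

```c
/* Vector types from the quoted test case (GNU C vector extensions).  */
typedef unsigned short v2hi __attribute__((vector_size(4)));
typedef unsigned int v2si __attribute__((vector_size(8)));

/* Zero-extend each 16-bit lane of X to 32 bits (V2HI -> V2SI).  */
v2si f (v2hi x) { return (v2si) {x[0], x[1]}; }

/* Hypothetical helper: returns 1 if both lanes were zero-extended.  */
int check_zero_extend (unsigned short a, unsigned short b)
{
  v2hi in = {a, b};
  v2si out = f (in);
  return out[0] == (unsigned int) a && out[1] == (unsigned int) b;
}
```

Calling `check_zero_extend` with values whose top bit is set (for example 0xFFFF) is a quick way to confirm that the lanes are zero-extended rather than sign-extended.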

[PATCH] recog: Fix propagation into ASM_OPERANDS

2023-10-24 Thread Richard Sandiford
An inline asm with multiple output operands is represented as a parallel set in which the SET_SRCs are the same (shared) ASM_OPERANDS. insn_propagation didn't account for this, and instead propagated into each ASM_OPERANDS individually. This meant that it could apply a substitution X->Y to Y itself
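As an illustration of the kind of source that leads to this representation, here is a minimal sketch of a GNU C inline asm with two output operands. The function name `pass_through_pair` and the empty asm template are assumptions chosen so that the example builds on any GCC target; they are not taken from the patch.

```c
/* Hypothetical example: one asm statement with two outputs.  Per the
   description above, such an asm is represented as a parallel set whose
   SET_SRCs are the same shared ASM_OPERANDS.  */
void pass_through_pair (long a, long b, long *out0, long *out1)
{
  long x, y;
  /* The matching constraints "0" and "1" tie each input to the
     corresponding output register.  The template is empty, so the
     outputs simply receive the input values.  */
  __asm__ ("" : "=r" (x), "=r" (y) : "0" (a), "1" (b));
  *out0 = x;
  *out1 = y;
}
```

Because both outputs come from a single asm statement, a substitution applied to its inputs should be applied once to the shared operands rather than once per output.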

[PATCH] recog/reload: Remove old UNARY_P operand support

2023-10-24 Thread Richard Sandiford
reload and constrain_operands had some old code to look through unary operators. E.g. an operand could be (sign_extend (reg X)), and the constraints would match the reg rather than the sign_extend. This was previously used by the MIPS port. But relying on it was a recurring source of problems, s

[PATCH] i386: Fix undefined masks in vpopcnt tests

2023-10-24 Thread Richard Sandiford
The files changed in this patch had tests for masked and unmasked popcnt. However, the mask inputs to the masked forms were undefined, and would be set to zero by init_regs. Any combine-like pass that ran after init_regs could then fold the masked forms into the unmasked ones. I saw this while t
