[PATCH 3/6] rtl-ssa: Fix ICE when deleting memory clobbers

2023-10-24 Thread Richard Sandiford
Sometimes an optimisation can remove a clobber of scratch registers or scratch memory. We then need to update the DU chains to reflect the removed clobber. For registers this isn't a problem. Clobbers of registers are just momentary blips in the register's lifetime. They act as a barrier for mo

[PATCH 6/6] rtl-ssa: Handle call clobbers in more places

2023-10-24 Thread Richard Sandiford
In order to save (a lot of) memory, RTL-SSA avoids creating individual clobber records for every call-clobbered register. It instead maintains a list & splay tree of calls in an EBB, grouped by ABI. This patch takes these call clobbers into account in a couple more routines. I don't think this wi

[PATCH 4/6] rtl-ssa: Handle artificial uses of deleted defs

2023-10-24 Thread Richard Sandiford
If an optimisation removes the last real use of a definition, there can still be artificial uses left. This patch removes those uses too. These artificial uses exist because RTL-SSA is only an SSA-like view of the existing RTL IL, rather than a native SSA representation. It effectively treats RTL

[PATCH 2/6] rtl-ssa: Create REG_UNUSED notes after all pending changes

2023-10-24 Thread Richard Sandiford
Unlike REG_DEAD notes, REG_UNUSED notes need to be kept free of false positives by all passes. function_info::change_insns does this by removing all REG_UNUSED notes, and then using add_reg_unused_notes to add notes back (or create new ones) where appropriate. The problem was that it called add_r

Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-10-24 Thread Richard Sandiford
Richard Biener writes: > On Thu, 19 Oct 2023, Robin Dapp wrote: > >> Ugh, I didn't push yet because with a rebased trunk I am >> seeing different behavior for some riscv testcases. >> >> A reduction is not recognized because there is yet another >> "double use" occurrence in check_reduction_path.

Re: [PING][PATCH 2/2] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2023-10-24 Thread Richard Sandiford
Sorry for the slow review. I had a look at the arm bits too, to get some context for the target-independent bits. Stamatis Markianos-Wright via Gcc-patches writes: > [...] > diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h > index 77e76336e94..74186930f0b 100644 > --- a/gcc

[PATCH 0/3] rtl-ssa: Various extensions for the late-combine pass

2023-10-24 Thread Richard Sandiford
This series adds some RTL-SSA enhancements that are needed by the late-combine pass. Tested on aarch64-linux-gnu & x86_64-linux-gnu. OK to install? Richard Richard Sandiford (3): rtl-ssa: Use frequency-weighted insn costs rtl-ssa: Extend make_uses_available rtl-ssa: Add new he

[PATCH 1/3] rtl-ssa: Use frequency-weighted insn costs

2023-10-24 Thread Richard Sandiford
rtl_ssa::changes_are_worthwhile used the standard approach of summing up the individual costs of the old and new sequences to see which one is better overall. But when optimising for speed and changing instructions in multiple blocks, it seems better to weight the cost of each instruction by its e
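
The comparison described above can be sketched roughly as follows. This is only an illustration, not the RTL-SSA code: `insn_cost_entry`, `plain_cost`, `weighted_cost` and `change_is_worthwhile` are hypothetical names, and the weight is assumed to be the execution frequency of the instruction's block (the snippet is cut off at that point, so this reading relies on the subject line).

```cpp
#include <vector>

// Hypothetical record: the cost of one instruction and the estimated
// execution frequency of the block that contains it.
struct insn_cost_entry
{
  unsigned int cost;
  double block_frequency;
};

// The "standard approach": sum the individual costs.
unsigned int
plain_cost (const std::vector<insn_cost_entry> &seq)
{
  unsigned int total = 0;
  for (const insn_cost_entry &entry : seq)
    total += entry.cost;
  return total;
}

// Frequency-weighted variant: each cost is scaled by how often its
// block is expected to execute.
double
weighted_cost (const std::vector<insn_cost_entry> &seq)
{
  double total = 0.0;
  for (const insn_cost_entry &entry : seq)
    total += entry.cost * entry.block_frequency;
  return total;
}

// Use the weighted sum when optimising for speed, the plain sum otherwise.
bool
change_is_worthwhile (const std::vector<insn_cost_entry> &old_seq,
                      const std::vector<insn_cost_entry> &new_seq,
                      bool optimize_for_speed)
{
  if (optimize_for_speed)
    return weighted_cost (new_seq) < weighted_cost (old_seq);
  return plain_cost (new_seq) < plain_cost (old_seq);
}
```

With these definitions, replacing one cost-4 instruction in a hot block with a cost-1 instruction there plus a cost-5 instruction in a cold block looks worse under the plain sum but better under the weighted one.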

[PATCH 2/3] rtl-ssa: Extend make_uses_available

2023-10-24 Thread Richard Sandiford
The first in-tree use of RTL-SSA was fwprop, and one of the goals was to make the fwprop rewrite preserve the old behaviour as far as possible. The switch to RTL-SSA was supposed to be a pure infrastructure change. So RTL-SSA has various FIXMEs for things that were artificially limited to faciliat

[PATCH 3/3] rtl-ssa: Add new helper functions

2023-10-24 Thread Richard Sandiford
This patch adds some RTL-SSA helper functions. They will be used by the upcoming late-combine pass. The patch contains the first non-template out-of-line function declared in movement.h, so it adds a movement.cc. I realise it seems a bit over-the-top to have a file with just one function, but it

[PATCH] Add a late-combine pass [PR106594]

2023-10-24 Thread Richard Sandiford
This patch adds a combine pass that runs late in the pipeline. There are two instances: one between combine and split1, and one after postreload. The pass currently has a single objective: remove definitions by substituting into all uses. The pre-RA version tries to restrict itself to cases that

Re: PR111754

2023-10-24 Thread Richard Sandiford
Hi, Sorry for the slow review. I clearly didn't think this through properly when doing the review of the original patch, so I wanted to spend some time working on the code to get a better understanding of the problem. Prathamesh Kulkarni writes: > Hi, > For the following test-case: > > typedef floa

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-10-24 Thread Richard Sandiford
Robin Dapp writes: > The attached patch introduces a VCOND_MASK_LEN, helps for the riscv cases > that were broken before and looks unchanged on x86, aarch64 and power > bootstrap and testsuites. > > I only went with the minimal number of new match.pd patterns and did not > try stripping the length

Re: [PATCH-1v4, expand] Enable vector mode for compare_by_pieces [PR111449]

2023-10-25 Thread Richard Sandiford
HAO CHEN GUI writes: > Hi Haochen, > The regression cases are caused by "targetm.scalar_mode_supported_p" I added > for scalar mode checking. XImode, OImode and TImode (with -m32) are not > enabled in ix86_scalar_mode_supported_p. So they're excluded from by pieces > operations on i386. > > Th

Re: PR111754

2023-10-25 Thread Richard Sandiford
Sigh, I knew I should have waited until the morning to proof-read and send this. Richard Sandiford writes: > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc > index 40767736389..00fce4945a7 100644 > --- a/gcc/fold-const.cc > +++ b/gcc/fold-const.cc > @@ -107

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-10-25 Thread Richard Sandiford
Robin Dapp writes: >> At first, this seemed like an odd place to fold away the length. >> AFAIK the length in res_op is inherited directly from the original >> operation, and so it isn't any more redundant after the fold than >> it was before. But I suppose the reason for doing it here is that >>

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-10-25 Thread Richard Sandiford
钟居哲 writes: >>> Which one is right? > Hi, Richard. Let me explain this situation. > > Both situations are possible. It depends on whether the 'ELSE' value > is an uninitialized value. > > For reduction case: > > for (int i = 0; i < n; i++) > result += a[i] > > The trailing elements should be

Re: PR111754

2023-10-25 Thread Richard Sandiford
Prathamesh Kulkarni writes: > On Wed, 25 Oct 2023 at 02:58, Richard Sandiford > wrote: >> >> Hi, >> >> Sorry for the slow review. I clearly didn't think this through properly >> when doing the review of the original patch, so I wanted to spend >>

Re: [PATCH] DOC: Update COND_LEN document

2023-10-26 Thread Richard Sandiford
Juzhe-Zhong writes: > As Richard suggested, we need to adapt the doc for cond_len operations. > > gcc/ChangeLog: > > * doc/md.texi: Update document. Thanks for addressing my comment. I was thinking about the pseudo code though. Currently it is: for (i = 0; i < ops[4] + ops[5]; i++) op0[i

Re: [PATCH V2] DOC: Update COND_LEN document

2023-10-26 Thread Richard Sandiford
Juzhe-Zhong writes: > gcc/ChangeLog: > > * doc/md.texi: Adapt COND_LEN pseudo code. OK. Given your earlier message, I'd just finished writing & testing the same patch. Richard > --- > gcc/doc/md.texi | 18 -- > 1 file changed, 12 insertions(+), 6 deletions(-) > > diff

[PATCH] testsuite: Allow general skips/requires in PCH tests

2023-10-26 Thread Richard Sandiford
dg-pch.exp handled dg-require-effective-target pch_supported_debug as a special case, by grepping the source code. This patch tries to generalise it to other dg-require-effective-targets, and to dg-skip-if. There also seemed to be some errors in check-flags. It used: lappend $args [list ]

Re: [PATCH V2 2/7] aarch64: Add support for aarch64-sys-regs.def

2023-10-26 Thread Richard Sandiford
Thanks for the updates. Victor Do Nascimento writes: > On 10/18/23 22:07, Richard Sandiford wrote: >> Victor Do Nascimento writes: >>> This patch defines the structure of a new .def file used for >>> representing the aarch64 system registers, what information it sh

Re: [PATCH V2 5/7] aarch64: Implement system register r/w arm ACLE intrinsic functions

2023-10-26 Thread Richard Sandiford
Victor Do Nascimento writes: > On 10/18/23 21:39, Richard Sandiford wrote: >> Victor Do Nascimento writes: >>> Implement the aarch64 intrinsics for reading and writing system >>> registers with the following signatures: >>> >>> ui

Re: [PATCH V2 7/7] aarch64: Add system register duplication check selftest

2023-10-26 Thread Richard Sandiford
Victor Do Nascimento writes: > On 10/18/23 22:30, Richard Sandiford wrote: >> Victor Do Nascimento writes: >>> Add a build-time test to check whether system register data, as >>> imported from `aarch64-sys-reg.def' has any duplicate entries. >>> >

Re: [PATCH] aarch64: Add basic target_print_operand support for CONST_STRING

2023-10-26 Thread Richard Sandiford
Victor Do Nascimento writes: > Motivated by the need to print system register names in output > assembly, this patch adds the required logic to > `aarch64_print_operand' to accept rtxs of type CONST_STRING and > process these accordingly. > > Consequently, an rtx such as: > > (set (reg/i:DI 0 x0

Re: [PATCH v2] VECT: Remove the type size restriction of vectorizer

2023-10-26 Thread Richard Sandiford
her words, I don't think simply removing the test from the vectoriser is correct. It needs to be replaced by something more selective. Thanks, Richard >> Pan >> >> -Original Message- >> From: Richard Biener >> Sent: Thursday, October 26, 2023 4:38 PM >&g

Re: [1/3] Add support for target_version attribute

2023-10-26 Thread Richard Sandiford
Andrew Carlotti writes: > This patch adds support for the "target_version" attribute to the middle > end and the C++ frontend, which will be used to implement function > multiversioning in the aarch64 backend. > > Note that C++ is currently the only frontend which supports > multiversioning using

Re: [PATCH] testsuite, aarch64: Normalise options to aarch64.exp.

2023-10-26 Thread Richard Sandiford
Iain Sandoe writes: > tested on cfarm185 (aarch64-linux-gnu, xgene1) and with the aarch64 > Darwin prototype. It is possible that some initial fallout could occur > on some test setups (where the default has been catered for in some > way) - but that should stabilize. OK for trunk? > thanks > Ia

Re: [PATCH] testsuite, Darwin: Add support for Mach-O function body scans.

2023-10-26 Thread Richard Sandiford
Iain Sandoe writes: > This was written before Thomas' modification to the ELF-handling to allow > a config-based change for target details. I did consider updating this > to try and use that scheme, but I think that it would sit a little > awkwardly, since there are some differences in the start-

Re: [PATCH, expand] Checking available optabs for scalar modes in by pieces operations

2023-10-27 Thread Richard Sandiford
HAO CHEN GUI writes: > Hi, > This patch checks available optabs for scalar modes used in by > pieces operations. It fixes the regression cases caused by previous > patch. Now both scalar and vector modes are examined by the same > approach. > > Bootstrapped and tested on x86 and powerpc64-linu

Re: [PATCH] recog: Fix propagation into ASM_OPERANDS

2023-10-27 Thread Richard Sandiford
Jeff Law writes: > On 10/24/23 04:15, Richard Sandiford wrote: >> An inline asm with multiple output operands is represented as a >> parallel set in which the SET_SRCs are the same (shared) ASM_OPERANDS. >> insn_propagation didn't account for this, and instead propagated

Re: [2/3] [aarch64] Add function multiversioning support

2023-10-30 Thread Richard Sandiford
Andrew Carlotti writes: > This adds initial support for function multiversion on aarch64 using the > target_version and target_clones attributes. This mostly follows the > Beta specification in the ACLE [1], with a few differences that remain to > be fixed: > > - Symbol mangling for target_clones di

Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-10-31 Thread Richard Sandiford
Robin Dapp writes: > Changed as suggested. The difference to v5 is thus: > > + if (cond_fn_p) > + { > + gcall *call = dyn_cast <gcall *> (use_stmt); > + unsigned else_pos > + = internal_fn_else_index (internal_fn (op.code)); > + > + for (unsigned int

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-11-02 Thread Richard Sandiford
Robin Dapp writes: >> Looks reasonable overall. The new match patterns are 1:1 the >> same as the COND_ ones. That's a bit awkward, but I don't see >> a good way to "macroize" stuff further there. Can you at least >> interleave the COND_LEN_* ones with the other ones instead of >> putting them

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-11-03 Thread Richard Sandiford
Robin Dapp writes: >> Could you explain why a special expansion is needed? (Sorry if you already >> have and I missed it, bit overloaded ATM.) What does it do that is >> different from what expand_fn_using_insn would do? > > All it does (in excess) is shuffle the arguments - vcond_mask_len has t

[pushed] aarch64: Remove unnecessary can_create_pseudo_p condition

2023-11-03 Thread Richard Sandiford
This patch removes a can_create_pseudo_p condition from *cmov_uxtw_insn_insv, bringing it in line with *cmov_insn_insv. The constraints correctly describe the requirements. Tested on aarch64-linux-gnu & pushed. Richard gcc/ * config/aarch64/aarch64.md (*cmov_uxtw_insn_insv): Remove

[PATCH] aarch64: Rework aarch64_modes_tieable_p [PR112105]

2023-11-03 Thread Richard Sandiford
On AArch64, can_change_mode_class and modes_tieable_p are mostly answering the same questions: (a) Do two modes have the same layout for the bytes that are common to both modes? (b) Do all valid subregs involving the two modes behave as GCC would expect? (c) Is there at least one registe

[pushed] read-rtl: Fix infinite loop while parsing [...]

2023-11-05 Thread Richard Sandiford
read_rtx_operand would spin endlessly for: (unspec [(...))] UNSPEC_FOO) because read_nested_rtx does nothing if the next character is not '('. Pushed after testing on aarch64-linux-gnu & x86_64-linux-gnu. Richard gcc/ * read-rtl.cc (read_rtx_operand): Avoid spinning endlessly for

[pushed] mode-switching: Remove unused bbnum field

2023-11-05 Thread Richard Sandiford
seginfo had an unused bbnum field, presumably dating from before BB information was attached directly to insns. Pushed as obvious after testing on aarch64-linux-gnu & x86_64-linux-gnu. Richard gcc/ * mode-switching.cc: Remove unused forward references. (seginfo): Remove bbnum.

[PATCH] explow: Allow dynamic allocations after vregs

2023-11-05 Thread Richard Sandiford
This patch allows allocate_dynamic_stack_space to be called before or after virtual registers have been instantiated. It uses the same approach as allocate_stack_local, which already supported this. Tested on aarch64-linux-gnu & x86_64-linux-gnu. OK to install? Richard gcc/ * function

[PATCH] explow: Avoid unnecessary alignment operations

2023-11-05 Thread Richard Sandiford
align_dynamic_address would output alignment operations even for a required alignment of 1 byte. Tested on aarch64-linux-gnu & x86_64-linux-gnu. OK to install? Richard gcc/ * explow.cc (align_dynamic_address): Do nothing if the required alignment is a byte. --- gcc/explow.cc |
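
A minimal sketch of the early exit described above, assuming a power-of-two alignment given in bytes. `align_address_sketch` and the `ops_emitted` counter are hypothetical stand-ins; the real function emits RTL rather than computing a value, and its exact operation sequence is not shown in the snippet.

```cpp
#include <cstdint>

// Round ADDR up to a multiple of ALIGN_BYTES (assumed to be a power of
// two).  OPS_EMITTED counts the arithmetic operations that a code
// generator would have to output for the rounding.
std::uintptr_t
align_address_sketch (std::uintptr_t addr, std::uintptr_t align_bytes,
                      unsigned int &ops_emitted)
{
  // Every address is already byte-aligned, so there is nothing to do.
  if (align_bytes == 1)
    return addr;

  // One addition and one mask.
  ops_emitted += 2;
  return (addr + align_bytes - 1) & ~(align_bytes - 1);
}
```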

[PATCH 00/12] Tweaks and extensions to the mode-switching pass

2023-11-05 Thread Richard Sandiford
triplet per other target that uses mode switching. OK to install? Thanks, Richard Richard Sandiford (12): mode-switching: Tweak the macro/hook documentation mode-switching: Add note problem mode-switching: Avoid quadratic list operation mode-switching: Fix the mode passed to the em

[PATCH 01/12] mode-switching: Tweak the macro/hook documentation

2023-11-05 Thread Richard Sandiford
I found the documentation for the mode-switching macros/hooks a bit hard to follow at first. This patch tries to add the information that I think would have made it easier to understand. Of course, documentation preferences are personal, and so I could be changing something that others understood

[PATCH 02/12] mode-switching: Add note problem

2023-11-05 Thread Richard Sandiford
optimize_mode_switching uses REG_DEAD notes to track register liveness, but it failed to tell DF to calculate up-to-date notes. Noticed by inspection. I don't have a testcase that fails because of this. gcc/ * mode-switching.cc (optimize_mode_switching): Call df_note_add_problem.

[PATCH 03/12] mode-switching: Avoid quadratic list operation

2023-11-05 Thread Richard Sandiford
add_seginfo chained insn information to the end of a list by starting at the head of the list. This patch avoids the quadraticness by keeping track of the tail pointer. gcc/ * mode-switching.cc (add_seginfo): Replace head pointer with a pointer to the tail pointer. (optimi
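
The tail-pointer technique mentioned above can be illustrated as follows. This is a generic sketch, not the mode-switching code: `seg_node`, `seg_list`, `append_slow` and `append_fast` are hypothetical names.

```cpp
// Hypothetical singly-linked list node.
struct seg_node
{
  int value;
  seg_node *next;
};

// A list that remembers where the next node must be stored: tail_ptr
// points at HEAD while the list is empty and at the last node's NEXT
// field afterwards.
struct seg_list
{
  seg_node *head = nullptr;
  seg_node **tail_ptr = &head;

  seg_list () = default;
  // Copying would leave tail_ptr pointing into the original object.
  seg_list (const seg_list &) = delete;
  seg_list &operator= (const seg_list &) = delete;
};

// Quadratic overall: every append walks the whole list from the head.
void
append_slow (seg_node *&head, seg_node *node)
{
  node->next = nullptr;
  seg_node **ptr = &head;
  while (*ptr)
    ptr = &(*ptr)->next;
  *ptr = node;
}

// Constant time per append: store through the tail pointer, then advance it.
void
append_fast (seg_list &list, seg_node *node)
{
  node->next = nullptr;
  *list.tail_ptr = node;
  list.tail_ptr = &node->next;
}
```

Both functions build the same list; only the cost of finding the insertion point differs.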

[PATCH 04/12] mode-switching: Fix the mode passed to the emit hook

2023-11-05 Thread Richard Sandiford
optimize_mode_switching passes an entity's current mode (if known) to the emit hook. However, the mode that it passed ignored the effect of the after hook. Instead, the mode for the first emit call in a block was taken from the incoming mode, whereas the mode for each subsequent emit call was tak

[PATCH 05/12] mode-switching: Simplify recording of transparency

2023-11-05 Thread Richard Sandiford
For a given block, an entity is either transparent for all modes or for none. Each update to the transparency set therefore used a loop like: for (i = 0; i < no_mode; i++) clear_mode_bit (transp[bb->index], j, i); This patch instead starts out with a bit-per-blo

[PATCH 06/12] mode-switching: Tweak entry/exit handling

2023-11-05 Thread Richard Sandiford
An entity isn't transparent in a block that requires a specific mode. optimize_mode_switching took that into account for normal insns, but didn't for the exit block. Later patches misbehaved because of this. In contrast, an entity was correctly marked as non-transparent in the entry block, but th

[PATCH 07/12] mode-switching: Allow targets to set the mode for EH handlers

2023-11-05 Thread Richard Sandiford
The mode-switching pass already had hooks to say what mode an entity is in on entry to a function and what mode it must be in on return. For SME, we also want to say what mode an entity is guaranteed to be in on entry to an exception handler. gcc/ * target.def (mode_switching.eh_handler):

[PATCH 08/12] mode-switching: Pass set of live registers to the needed hook

2023-11-05 Thread Richard Sandiford
The emit hook already takes the set of live hard registers as input. This patch passes it to the needed hook too. SME uses this to optimise the mode choice based on whether state is live or dead. The main caller already had access to the required info, but the special handling of return values di

[PATCH 09/12] mode-switching: Pass the set of live registers to the after hook

2023-11-05 Thread Richard Sandiford
This patch passes the set of live hard registers to the after hook, like the previous one did for the needed hook. gcc/ * target.def (mode_switching.after): Add a regs_live parameter. * doc/tm.texi: Regenerate. * config/epiphany/epiphany-protos.h (epiphany_mode_after): Upda

[PATCH 10/12] mode-switching: Use 1-based edge aux fields

2023-11-05 Thread Richard Sandiford
The pass used the edge aux field to record which mode change should happen on the edge, with -1 meaning "none". It's more convenient for later patches to leave aux zero for "none", and use numbers based at 1 to record a change. gcc/ * mode-switching.cc (commit_mode_sets): Use 1-based edge
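
The encoding change can be sketched like this. The names are hypothetical, and a plain integer field stands in for the edge's aux slot; the point is only that a zero-initialised field already means "none".

```cpp
#include <cstdint>

// Hypothetical stand-in for an edge with an auxiliary field.
struct edge_sketch
{
  std::intptr_t aux = 0;   // 0 means "no mode change recorded"
};

// Store MODE (a 0-based index) as a 1-based value.
inline void
record_mode_change (edge_sketch &e, int mode)
{
  e.aux = static_cast<std::intptr_t> (mode) + 1;
}

inline bool
has_mode_change (const edge_sketch &e)
{
  return e.aux != 0;
}

// Only meaningful when has_mode_change (e) is true.
inline int
recorded_mode (const edge_sketch &e)
{
  return static_cast<int> (e.aux - 1);
}
```

With the earlier -1 convention, every edge would need an explicit initialisation pass before "none" could be tested; with the 1-based form, a cleared field is enough.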

[PATCH 11/12] mode-switching: Add a target-configurable confluence operator

2023-11-05 Thread Richard Sandiford
The mode-switching pass assumed that all of an entity's modes were mutually exclusive. However, the upcoming SME changes have an entity with some overlapping modes, so that there is sometimes a "superunion" mode that contains two given modes. We can use this relationship to pass something more hel

[PATCH 12/12] mode-switching: Add a backprop hook

2023-11-05 Thread Richard Sandiford
This patch adds a way for targets to ask that selected mode changes be brought forward, through a combination of: (1) requiring a mode in blocks where the entity was previously transparent (2) pushing the transition at the head of a block onto incoming edges SME has two uses for this: - A

Re: [1/3] Add support for target_version attribute

2023-11-05 Thread Richard Sandiford
Andrew Carlotti writes: > On Thu, Oct 26, 2023 at 07:41:09PM +0100, Richard Sandiford wrote: >> Andrew Carlotti writes: >> > This patch adds support for the "target_version" attribute to the middle >> > end and the C++ frontend, which will be used to imple

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-11-05 Thread Richard Sandiford
Robin Dapp writes: >> Ah, OK. IMO it's better to keep the optab operands the same as the IFN >> operands, even if that makes things inconsistent with vcond_mask. >> vcond_mask isn't really a good example to follow, since the operand >> order is not only inconsistent with the IFN, it's also incons

Re: [PATCH] testsuite, Darwin: Add support for Mach-O function body scans.

2023-11-05 Thread Richard Sandiford
Iain Sandoe writes: > Hi Richard, > >> On 26 Oct 2023, at 21:00, Iain Sandoe wrote: > >>> On 26 Oct 2023, at 20:49, Richard Sandiford > wrote: >>> >>> Iain Sandoe writes: >>>> This was written before Thomas' modification to t

Re: [PATCH] testsuite, Darwin: Add support for Mach-O function body scans.

2023-11-06 Thread Richard Sandiford
Iain Sandoe writes: > Hi Richard, > >> On 5 Nov 2023, at 12:11, Richard Sandiford wrote: >> >> Iain Sandoe writes: > >>>> On 26 Oct 2023, at 21:00, Iain Sandoe wrote: >>> >>>>> On 26 Oct 2023, at 20:49, Richard Sandiford >&

Re: [PING][PATCH 2/2] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2023-11-06 Thread Richard Sandiford
Stamatis Markianos-Wright writes: >> One of the main reasons for reading the arm bits was to try to answer >> the question: if we switch to a downcounting loop with a GE condition, >> how do we make sure that the start value is not a large unsigned >> number that is interpreted as negative by GE?

[PATCH 1/3] attribs: Cache the gnu namespace

2023-11-06 Thread Richard Sandiford
Later patches add more calls to get_attribute_namespace. For scoped attributes, this is a simple operation on tree pointers. But for normal GNU attributes (the vast majority), it involves a call to get_identifier ("gnu"). This patch caches the identifier for speed. Admittedly I'm just going off g
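
The caching idea can be sketched as follows. This is an illustration only: `get_identifier_sketch` mimics an interning lookup with a hash table, and `gnu_namespace_sketch` is a hypothetical accessor, not the name used in the patch.

```cpp
#include <string>
#include <unordered_set>

// An interned identifier: equal names share one address, so identifiers
// can be compared with a pointer comparison.
using identifier = const std::string *;

// Look NAME up in the intern table, adding it if necessary.  This costs
// a hash computation and a table probe on every call.
identifier
get_identifier_sketch (const std::string &name)
{
  static std::unordered_set<std::string> table;
  return &*table.insert (name).first;
}

// Cached form: the lookup for "gnu" happens once, on first use.
identifier
gnu_namespace_sketch ()
{
  static identifier cached = get_identifier_sketch ("gnu");
  return cached;
}
```

Element addresses in `std::unordered_set` stay valid when the table grows, which is what makes it safe to keep the cached pointer.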

[PATCH 2/3] attribs: Consider namespaces when comparing attributes

2023-11-06 Thread Richard Sandiford
decl_attributes and comp_type_attributes both had code that iterated over one list of attributes and looked for corresponding attributes in another list. This patch makes those lookups namespace-aware. Tested on aarch64-linux-gnu & x86_64-linux-gnu. OK to install? Richard gcc/ * attrib

[PATCH 3/3] attribs: Namespace-aware lookup_attribute_spec

2023-11-06 Thread Richard Sandiford
attribute_ignored_p already used a namespace-aware query to find the attribute_spec for an existing attribute: const attribute_spec *as = lookup_attribute_spec (TREE_PURPOSE (attr)); This patch does the same for other callers in the file. Tested on aarch64-linux-gnu & x86_64-linux-gnu. OK

Ping: [PATCH] Allow target attributes in non-gnu namespaces

2023-11-06 Thread Richard Sandiford
This is a ping+rebase of the patch below. I've also optimised the handling of ignored attributes so that we don't register empty tables. There was also a typo in the jit changes (which I had tested, but the typo didn't seem to cause a failure). Retested on aarch64-linux-gnu & x86_64-linux-gnu. T

Re: [PATCH] attribs: Use existing traits for excl_hash_traits

2023-11-06 Thread Richard Sandiford
Ping. Richard Sandiford via Gcc-patches writes: > excl_hash_traits can be defined more simply by reusing existing traits. > > Tested on aarch64-linux-gnu. OK to install? > > Richard > > > gcc/ > * attribs.cc (excl_hash_traits): Delete. > (test_attrib

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-11 Thread Richard Sandiford
Qing Zhao writes: >> On Sep 11, 2020, at 12:32 PM, Richard Sandiford >> >> If we go for (2), then I think it would be better to do >> it at the start of pass_late_compilation instead. (Some targets wouldn't >> cope with doing it later.) The reason for doi

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-11 Thread Richard Sandiford
Qing Zhao writes: >> On Sep 11, 2020, at 4:44 PM, Richard Sandiford >> wrote: >> >> Qing Zhao writes: >>>> On Sep 11, 2020, at 12:32 PM, Richard Sandiford >>>> >> If we go for (2), then I think it would be >>>> better t

Re: [PATCH v2] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-14 Thread Richard Sandiford
Richard Biener via Gcc-patches writes: > On gimple the above function (after fixing it) looks like > > VIEW_CONVERT_EXPR(u)[_1] = i_4(D); > > and the IFN idea I had would - for non-global memory 'u' only - transform > this to > > vector_register_2 = u; > vector_register_3 = .IFN_VEC_SET (vec

Re: [PATCH] rtlanal: fix subreg handling in set_noop_p ()

2020-09-14 Thread Richard Sandiford
Ilya Leoshkevich writes: > Bootstrapped and regtested on x86_64-redhat-linux. Ok for master? > > > > The following s390 rtx is erroneously considered a no-op: > > (set (subreg:DF (reg:TF %f0) 8) (subreg:DF (reg:V1TF %f0) 8)) > > Here, SET_DEST is a second register in a floating-point register pair

Re: [PATCH v2] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-14 Thread Richard Sandiford
Richard Biener writes: > On Mon, Sep 14, 2020 at 12:47 PM Richard Sandiford > wrote: >> >> Richard Biener via Gcc-patches writes: >> > On gimple the above function (after fixing it) looks like >> > >> > VIEW_CONVERT_EXPR(u)[_1] = i_4(D); >> &

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-14 Thread Richard Sandiford
Qing Zhao writes: >> Like I mentioned earlier though, passes that run after >> pass_thread_prologue_and_epilogue can use call-clobbered registers that >> weren't previously used. For example, on x86_64, the function might >> not use %r8 when the prologue, epilogue and returns are generated, >> bu

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-14 Thread Richard Sandiford
Qing Zhao writes: >> On Sep 14, 2020, at 11:33 AM, Richard Sandiford >> wrote: >> >> Qing Zhao writes: >>>> Like I mentioned earlier though, passes that run after >>>> pass_thread_prologue_and_epilogue can use call-clobbered registers that >&

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-15 Thread Richard Sandiford
Qing Zhao writes: >> On Sep 14, 2020, at 2:20 PM, Richard Sandiford >> wrote: >> >> Qing Zhao writes: >>>> On Sep 14, 2020, at 11:33 AM, Richard Sandiford >>>> wrote: >>>> >>>> Qi

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-15 Thread Richard Sandiford
Segher Boessenkool writes: > On Mon, Sep 14, 2020 at 05:33:33PM +0100, Richard Sandiford wrote: >> > However, for the cases on Power as Segher mentioned, there are also some >> > scratch registers used for >> > Other purpose, not sure whether we can correctly genera

Re: [PATCH] LRA: Make fixed eliminable registers live

2020-09-15 Thread Richard Sandiford
Thanks for looking at this. "H.J. Lu" writes: > commit 1bcb4c4faa4bd6b1c917c75b100d618faf9e628c > Author: Richard Sandiford > Date: Wed Oct 2 07:37:10 2019 + > > [LRA] Don't make eliminable registers live (PR91957) > > didn't make eliminable

Re: [PATCH] arm: Add new vector mode macros

2020-09-16 Thread Richard Sandiford
Ping Richard Sandiford writes: > [ This is related to Dennis's subtraction patch > https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553339.html > and the discussion about how the patterns were written. I wanted > to see whether there was a way that we could sim

Re: [PATCH V2] vec: don't select partial vectors when looping on full vectors

2020-09-16 Thread Richard Sandiford
Andrea Corallo writes: > Hi all, > > here is the updated version of the patch implementing suggestions. > > The check for 'vect_need_peeling_or_partial_vectors_p' (and its > comment) has also been moved just before so we can short-circuit the > partial vector handling if we know we are using full ve

Re: [PATCH] Ignore the clobbered stack pointer in asm statement

2020-09-16 Thread Richard Sandiford
Jakub Jelinek via Gcc-patches writes: > On Mon, Sep 14, 2020 at 08:57:18AM -0700, H.J. Lu via Gcc-patches wrote: >> Something like this for GCC 8 and 9. > > Guess my preference would be to do this everywhere and then let's discuss if > we change the warning into error there or keep it being deprec

Re: [PATCH] aarch64: Fix ICE on fpsr fpcr getters [PR96968]

2020-09-16 Thread Richard Sandiford
Andrea Corallo writes: > @@ -2034,6 +2034,16 @@ aarch64_expand_fpsr_fpcr_setter (int unspec, > machine_mode mode, tree exp) >emit_insn (gen_aarch64_set (unspec, mode, op)); > } > > +/* Expand a fpsr or fpcr getter (depending on UNSPEC) using MODE. > + Return the target. */ > +static rtx

Re: [PATCH] IRA: Don't make a global register eliminable

2020-09-16 Thread Richard Sandiford
"H.J. Lu" writes: > On Tue, Sep 15, 2020 at 7:44 AM Richard Sandiford > wrote: >> >> Thanks for looking at this. >> >> "H.J. Lu" writes: >> > commit 1bcb4c4faa4bd6b1c917c75b100d618faf9e628c >> > Author: Richard Sandiford >&

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-16 Thread Richard Sandiford
Qing Zhao writes: > Segher and Richard, > > Now there are two major concerns from the discussion so far: > > 1. (From Richard): Inserting zero insns should be done after > pass_thread_prologue_and_epilogue since later passes (for example, > pass_regrename) might introduce new used caller-saved

Re: [PATCH] aarch64: Add extend-as-extract-with-shift pattern [PR96998]

2020-09-17 Thread Richard Sandiford
Alex Coplan writes: > Hi Richard, > > On 10/09/2020 19:18, Richard Sandiford wrote: >> Alex Coplan writes: >> > Hello, >> > >> > Since r11-2903-g6b3034eaba83935d9f6dfb20d2efbdb34b5b00bf introduced a >> > canonicalization from mult to shif

Re: [PATCH] this patch modify an unused variable in aarch64-unwind.h

2020-09-17 Thread Richard Sandiford
Wei Wentao writes: > Hi, > >This patch modifies an unused variable in aarch64-unwind.h because the > warning says "unused parameter 'fs'". Thanks, pushed. I also broke the line after the parameter, to keep things within the 80-character limit. Richard > > Weiwt > regards! > > --- > libgcc

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-17 Thread Richard Sandiford
Qing Zhao writes: >> On Sep 17, 2020, at 1:17 AM, Richard Sandiford >> wrote: >> >> Qing Zhao writes: >>> Segher and Richard, >>> >>> Now there are two major concerns from the discussion so far: >>> >&

Re: [PATCH] aarch64: Fix dejaGNU directive in clastb_8.c testcase

2020-09-17 Thread Richard Sandiford
Andrea Corallo writes: > Hi all, > > this is to fix a typo in a dejaGNU directive introduced with > 052204fac58 "vec: don't select partial vectors when > unnecessary". > > Okay for trunk? OK, thanks. FWIW, this would also have been OK under the “obviously correct” rule (but asking is obvio

Re: [PATCH V2] aarch64: Fix ICE on fpsr fpcr getters [PR96968]

2020-09-17 Thread Richard Sandiford
Andrea Corallo writes: > Hi all, > > second version of the patch here implementing the suggestion of using > create_output_operand and the expand_insn machinery. > > Regtested and bootstrapped on aarch64-linux-gnu. > > Okay for trunk? > > Thanks > > Andrea > > From bb35b56810f908c575fec11435071d1c

Re: [PATCH V2] aarch64: Fix ICE on fpsr fpcr getters [PR96968]

2020-09-17 Thread Richard Sandiford
Richard Sandiford writes: >> @@ -2034,6 +2034,18 @@ aarch64_expand_fpsr_fpcr_setter (int unspec, >> machine_mode mode, tree exp) >>emit_insn (gen_aarch64_set (unspec, mode, op)); >> } >> >> +/* Expand a fpsr or fpcr getter (depending on UNSPEC)

Re: [PATCH] vect/test: Don't check for epilogue loop [PR97075]

2020-09-18 Thread Richard Sandiford
Thanks for looking at this. "Kewen.Lin" writes: > Hi, > > The commit r11-3230 brings a nice improvement to use full > vectors instead of partial vectors when available. But > it caused some vector with length test cases to fail on > Power. > > The failure on gcc.target/powerpc/p9-vec-length-epil

Re: [PATCH] vect/test: Don't check for epilogue loop [PR97075]

2020-09-20 Thread Richard Sandiford
"Kewen.Lin" writes: > Hi Richard, >> "Kewen.Lin" writes: >>> Hi, >>> >>> The commit r11-3230 brings a nice improvement to use full >>> vectors instead of partial vectors when available. But >>> it caused some vector with length test cases to fail on >>> Power. >>> >>> The failure on gcc.target/p

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-21 Thread Richard Sandiford
Qing Zhao writes: > Hi, Richard, > > During my implementation of the new version of the patch. I still feel that > it’s not practical to add a default definition in the middle end to just use > move patterns to zero each selected register. > > The major issues are: > > There are some target spe

Re: [PATCH] vect/test: Don't check for epilogue loop [PR97075]

2020-09-21 Thread Richard Sandiford
Andrea Corallo writes: > Richard Sandiford writes: > [...] >> Andrea, how should we handle this? Is it something you'd have time to >> look at? > > Hi Richard, > > I've not OK, NP. In that case I'll give it a go. > but FWIW your observations

Re: [PATCH] aarch64: Do not alter value on a force_reg returned rtx expanding __jcvt

2020-09-21 Thread Richard Sandiford
Andrea Corallo writes: > Hi all, > > From the `force_reg` description comment I see the returned register > should not be modified, thus IIUC should not be used as a GEN_FCN > target. > > Assuming my interpretation is correct this fix this case inside > `aarch64_general_expand_builtin` while expan

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-21 Thread Richard Sandiford
Qing Zhao writes: > My major concern with the default implementation of the hook is: > > If a target has some special registers that should not be zeroed, and we do > not provide an overridden implementation for this target, then the default > implementation will generate incorrect code for this

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-21 Thread Richard Sandiford
Qing Zhao writes: >> But in cases where there is no underlying concept that can sensibly >> be extracted out, it's OK if targets need to override the default >> to get correct behaviour. > > Then, on the target that the default code is not right, and we haven’t > provide overridden implementation

[PATCH] vect: Fix epilogue loop handling of partial vectors

2020-09-22 Thread Richard Sandiford
Richard Sandiford writes: > I'll try to have a patch ready tomorrow morning European time. Well, I totally failed to hit that deadline. When testing on Power, I saw a couple of extra failures, but I now think they're improvements rather than regressions. See the point about sin

[PATCH] aarch64: Add HF routines to libgcc_s.so

2020-09-22 Thread Richard Sandiford
The libgcc HF support routines were being linked into libgcc_s.so, but weren't being exported. Tested on aarch64-linux-gnu and aarch64_be-elf. Any thoughts? I'll apply Monday next week if there are no objections by then. I guess there's the question whether we should backport this to release bra

Re: [PATCH] aarch64: Add extend-as-extract-with-shift pattern [PR96998]

2020-09-22 Thread Richard Sandiford
Segher Boessenkool writes: > Hi Alex, > > On Tue, Sep 22, 2020 at 08:40:07AM +0100, Alex Coplan wrote: >> On 21/09/2020 18:35, Segher Boessenkool wrote: >> Thanks for doing this testing. The results look good, then: no code size >> changes and no build regressions. > > No *code* changes. I cannot

Re: [PATCH] arm: Add new vector mode macros

2020-09-22 Thread Richard Sandiford
Kyrylo Tkachov writes: > Hi Richard, > >> -----Original Message----- >> From: Richard Sandiford >> Sent: 16 September 2020 11:15 >> To: gcc-patches@gcc.gnu.org >> Cc: ni...@redhat.com; Richard Earnshaw ; >> Ramana Radhakrishnan ; Kyrylo >> Tkacho

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-22 Thread Richard Sandiford
Qing Zhao writes: >> On Sep 21, 2020, at 2:22 PM, Qing Zhao via Gcc-patches >> wrote: >> >> >> >>> On Sep 21, 2020, at 2:11 PM, Richard Sandiford >>> wrote: >>> >>> Qing Zhao writes: >>>>> But in cases where

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-22 Thread Richard Sandiford
Qing Zhao writes: >> On Sep 17, 2020, at 11:27 AM, Richard Sandiford >> wrote: >> >> Qing Zhao writes: >>>> On Sep 17, 2020, at 1:17 AM, Richard Sandiford >>>> wrote: >>>> >>>> Qing

Re: [PATCH] aarch64: Do not alter force_reg returned rtx expanding pauth builtins

2020-09-23 Thread Richard Sandiford
Andrea Corallo writes: > Hi all, > > having a look for force_reg returned rtx later on modified I've found > this other case in `aarch64_general_expand_builtin` while expanding > pointer authentication builtins. > > Regtested and bootstrapped on aarch64-linux-gnu. > > Okay for trunk? > > Andrea >
