from:"James Greenhalgh"

Re: [PATCH][AArch64] PR79262: Adjust vector cost

2018-11-09 Thread James Greenhalgh

On Fri, Nov 09, 2018 at 08:14:27AM -0600, Wilco Dijkstra wrote: > PR79262 has been fixed for almost all AArch64 cpus, however the example is > still > vectorized in a few cases, resulting in lower performance. Increase the cost > of > vector-to-scalar moves so it is more similar to the other vec

Re: [PATCH][AArch64] PR79262: Adjust vector cost

2018-11-09 Thread James Greenhalgh

On Mon, Jan 22, 2018 at 09:22:27AM -0600, Richard Biener wrote: > On Mon, Jan 22, 2018 at 4:01 PM, Wilco Dijkstra > wrote: > > PR79262 has been fixed for almost all AArch64 cpus, however the example is > > still > > vectorized in a few cases, resulting in lower performance. Increase the > > co

Re: Patch ping (was Re: [PATCH] Fix aarch64_compare_and_swap* constraints (PR target/87839))

2018-11-21 Thread James Greenhalgh

On Tue, Nov 20, 2018 at 11:04:46AM -0600, Jakub Jelinek wrote: > Hi! > > On Tue, Nov 13, 2018 at 10:28:16AM +0100, Jakub Jelinek wrote: > > 2018-11-13 Jakub Jelinek > > > > PR target/87839 > > * config/aarch64/atomics.md (@aarch64_compare_and_swap): Use > > rIJ constraint for aarch

Re: [PATCH v3] [aarch64] Correct the maximum shift amount for shifted operands.

2018-11-28 Thread James Greenhalgh

On Wed, Nov 28, 2018 at 07:08:02AM -0600, Philipp Tomsich wrote: > > > On 28.11.2018, at 13:10, Richard Earnshaw (lists) wrote: > > On 26/11/2018 19:50, Christoph Muellner wrote: > The aarch64 ISA specification allows a left shift amount to be applied > after

Re: [PATCH][AArch64][3/3] Introduce mla64 type

2018-11-28 Thread James Greenhalgh

On Mon, Nov 26, 2018 at 11:36:47AM -0600, Kyrill Tkachov wrote: > Hi all, > > On some cores the X-register MADD/MSUB (and hence MUL and MNEG) instructions > may behave differently > than the W-register forms and the scheduling models may want to reflect that. > That is currently not possible beca

Re: [PATCH][AArch64][2/3] Correct type attribute for mul and mneg instructions

2018-11-28 Thread James Greenhalgh

On Mon, Nov 26, 2018 at 11:36:43AM -0600, Kyrill Tkachov wrote: > Hi all, > > In the AArch64 ISA the MUL and MNEG instructions are actually aliases of > MADD and MSUB. > Therefore they should have the type attribute mla, rather than mul, which > should only be used > for AArch32 32-bit multipli

Re: [PATCH][GCC][AARCH64] Replace calls to strtok with strtok_r in aarch64 attribute handling code

2018-11-28 Thread James Greenhalgh

On Fri, Nov 23, 2018 at 08:22:49AM -0600, Sam Tebbs wrote: > Hi all, > > The AArch64 general attribute handling code uses the strtok function to > separate comma-delimited attributes in a string. This causes problems for and > interferes with attribute-specific handling code that also uses strtok
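The interference described in this preview can be illustrated with a small sketch. This is not the GCC attribute code: the function names and the '+' inner delimiter are hypothetical, chosen only to show an outer and an inner tokenizer running over the same buffer. Each loop keeps its own save pointer through strtok_r, so the inner loop cannot disturb the outer one.

```c
#define _POSIX_C_SOURCE 200809L  /* for strtok_r under strict -std= modes */
#include <assert.h>
#include <string.h>

/* Hypothetical inner parser: counts '+'-separated pieces of one attribute,
   e.g. "arch=armv8-a+crc" has two.  It tokenizes with its own save pointer. */
static int count_pieces(char *attr)
{
    char *save = NULL;
    int n = 0;
    for (char *tok = strtok_r(attr, "+", &save); tok != NULL;
         tok = strtok_r(NULL, "+", &save))
        n++;
    return n;
}

/* Hypothetical outer parser: splits a comma-delimited attribute string and
   hands each attribute to the inner parser.  The buffer is modified in place. */
int count_all_pieces(char *attrs)
{
    assert(attrs != NULL);
    char *save = NULL;
    int total = 0;
    for (char *attr = strtok_r(attrs, ",", &save); attr != NULL;
         attr = strtok_r(NULL, ",", &save))
        total += count_pieces(attr);
    return total;
}
```

If both loops used plain strtok instead, the inner call would overwrite the single hidden position that strtok keeps, and the outer loop would typically stop after the first attribute.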

Re: [PATCH 3/9][GCC][AArch64] Add autovectorization support for Complex instructions

2018-11-28 Thread James Greenhalgh

On Mon, Nov 12, 2018 at 06:31:45AM -0600, Tamar Christina wrote: > Hi Kyrill, > > > Hi Tamar, > > > > On 11/11/18 10:26, Tamar Christina wrote: > > > Hi All, > > > > > > This patch adds the expander support for supporting autovectorization of > > > complex number operations > > > such as Complex

Re: [PATCH 4/9][GCC][AArch64/Arm] Add new testsuite directives to check complex instructions.

2018-11-28 Thread James Greenhalgh

On Sun, Nov 11, 2018 at 04:27:04AM -0600, Tamar Christina wrote: > Hi All, > > This patch adds new testsuite directive for both Arm and AArch64 to support > testing of the Complex Arithmetic operations from Armv8.3-a. > > Bootstrap and Regtest on aarch64-none-linux-gnu, arm-none-gnueabihf and >

Re: [PATCH 5/9][GCC][AArch64/Arm] Add auto-vectorization tests.

2018-11-28 Thread James Greenhalgh

On Sun, Nov 11, 2018 at 04:27:33AM -0600, Tamar Christina wrote: > Hi All, > > This patch adds tests for AArch64 and Arm to test the autovectorization > of complex numbers using the Armv8.3-a instructions. > > This patch enables them only for AArch64 at this point. > > Bootstrapped Regtested on

Re: [PATCH][arm/AArch64] Assume unhandled NEON types are neon_arith_basic types when scheduling for Cortex-A5

2019-07-01 Thread James Greenhalgh

On Mon, Jul 01, 2019 at 04:13:40PM +0100, Kyrill Tkachov wrote: > Hi all, > > Some scheduling descriptions, like the Cortex-A57 one, are reused for > multiple -mcpu options. > Sometimes those other -mcpu cores support more architecture features > than the Armv8-A Cortex-A57. > For example, the C

Re: [PATCH][AArch64] Remove constraint strings from define_expand constructs in the back end

2019-07-01 Thread James Greenhalgh

On Mon, Jun 24, 2019 at 04:33:40PM +0100, Dennis Zhang wrote: > Hi, > > A number of AArch64 define_expand patterns have specified constraints > for their operands. But the constraint strings are ignored at expand > time and are therefore redundant/useless. We now avoid specifying > constraints

Re: [PING][AArch64] Use scvtf fbits option where appropriate

2019-07-01 Thread James Greenhalgh

On Wed, Jun 26, 2019 at 10:35:00AM +0100, Joel Hutton wrote: > Ping, plus minor rework (mostly non-functional changes) > > gcc/ChangeLog: > > 2019-06-12 Joel Hutton > > * config/aarch64/aarch64-protos.h (aarch64_fpconst_pow2_recip): New > prototype > * config/aarch64/aarch6

Re: [patch 1/2][aarch64]: redefine aes patterns

2019-07-08 Thread James Greenhalgh

On Fri, Jul 05, 2019 at 12:24:42PM +0100, Sylvia Taylor wrote: > Greetings, > > This first patch removes aarch64 usage of the aese/aesmc and aesd/aesimc > fusions (i.e. aes fusion) implemented in the scheduler due to unpredictable > behaviour observed in cases such as: > - when register allocation

Re: [PATCH][GCC][AArch64] Make processing less fragile in config.gcc

2019-07-08 Thread James Greenhalgh

On Tue, Jun 25, 2019 at 09:30:30AM +0100, Tamar Christina wrote: > Hi All, > > This is an update to the patch rebased to after the SVE2 options have been > merged. > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. > > Ok for trunk? OK. Thanks, James > > Thanks, > Tamar >

Re: [patch][aarch64]: add intrinsics for vld1(q)_x4 and vst1(q)_x4

2019-07-18 Thread James Greenhalgh

On Mon, Jun 10, 2019 at 06:21:05PM +0100, Sylvia Taylor wrote: > Greetings, > > This patch adds the intrinsic functions for: > - vld1__x4 > - vst1__x4 > - vld1q__x4 > - vst1q__x4 > > Bootstrapped and tested on aarch64-none-linux-gnu. > > Ok for trunk? If yes, I don't have any commit rights, so c

Re: [patch][aarch64]: add usra and ssra combine patterns

2019-07-22 Thread James Greenhalgh

Regarding aarch64_sra_n, this patch shouldn't affect it. > > I am also not aware of any way of enabling this combine inside the pattern > used for those intrinsics, so I kept them separate. > > Cheers, > Syl > > -Original Message- > From: James Greenhalgh

Re: [PATCH, GCC, AArch64] Enable Transactional Memory Extension

2019-07-22 Thread James Greenhalgh

On Wed, Jul 10, 2019 at 07:55:42PM +0100, Sudakshina Das wrote: > Hi > > This patch enables the new Transactional Memory Extension announced > recently as part of Arm's new architecture technologies. > We introduce a new optional extension "tme" to enable this. The > following instructions are p

Re: [PATCH][GCC][AArch64] Fix command line options canonicalization version #2. (PR target/88530)

2019-02-21 Thread James Greenhalgh

On Wed, Feb 20, 2019 at 08:00:38AM -0600, Tamar Christina wrote: > Hi All, > > Commandline options on AArch64 don't get canonicalized into the smallest > possible set before output to the assembler. This means that overlapping > feature > sets are emitted with superfluous parts. > > Normally thi

Re: [PATCH][AArch64] Add support for Neoverse N1

2019-02-21 Thread James Greenhalgh

On Thu, Feb 21, 2019 at 11:42:56AM -0600, Kyrill Tkachov wrote: > Hi all, > > This patch adds support for the Neoverse N1 CPU [1]. This was supported > in GCC earlier through the codename Ares, > which it now replaces. -mcpu=ares is still accepted as there's been a > binutils release supporting

Re: [PATCH][AArch64] Add support for Neoverse E1

2019-02-21 Thread James Greenhalgh

On Thu, Feb 21, 2019 at 11:43:08AM -0600, Kyrill Tkachov wrote: > Hi all, > > This patch adds -mcpu and -mtune support for the Neoverse E1 CPU [1]. > The new option is -mcpu=neoverse-e1. > Bootstrapped and tested on aarch64-none-linux-gnu. OK. Thanks, James > [1] > https://community.arm.com/pr

Re: [PATCH 1/2][GCC][AArch64] Update Armv8.4-a's FP16 FML intrinsics

2019-02-21 Thread James Greenhalgh

On Wed, Feb 20, 2019 at 08:00:13AM -0600, Tamar Christina wrote: > Hi All, > > This patch updates the Armv8.4-a FP16 FML intrinsics's suffixes from u32 to > f16 > to be more consistent with the naming convention for intrinsics. > > The specifications for these intrinsics have not been published

Re: [PATCH, GCC, AArch64] Fix a couple of bugs in BTI

2019-02-21 Thread James Greenhalgh

On Thu, Feb 21, 2019 at 06:19:10AM -0600, Sudakshina Das wrote: > Hi > > While doing more testing I found a couple of issues with my BTI patches. > This patch fixes them: > 1) Remove a reference to return address key. The original patch was > written based on a different not yet committed patch

Re: [Patch] [aarch64] PR target/89324 Handle stack pointer for SUBS/ADDS instructions

2019-02-21 Thread James Greenhalgh

On Mon, Feb 18, 2019 at 08:40:12AM -0600, Matthew Malcomson wrote: > Handle stack pointer with SUBS/ADDS instructions. > > In general the stack pointer was not handled for many SUBS/ADDS patterns in > aarch64.md. > Both the "extended register" and "immediate" forms allow the stack pointer to > be

Re: [Patch] [aarch64] PR target/89324 Handle stack pointer for SUBS/ADDS instructions

2019-02-22 Thread James Greenhalgh

On Fri, Feb 22, 2019 at 09:39:59AM -0600, Matthew Malcomson wrote: > Hi James, > > On 22/02/19 00:09, James Greenhalgh wrote: > > On Mon, Feb 18, 2019 at 08:40:12AM -0600, Matthew Malcomson wrote: > >> > >> Additionally, this patch contains two tidy-ups (hap

Re: [PATCH] Improve arm and aarch64 casesi (PR target/70341)

2019-02-27 Thread James Greenhalgh

On Fri, Feb 22, 2019 at 06:20:51PM -0600, Jakub Jelinek wrote: > Hi! > > The testcase in the PR doesn't hoist any memory loads from the large switch > before the switch on aarch64 and arm (unlike e.g. x86), because the > arm/aarch64 casesi patterns don't properly annotate the memory load from the

Re: [PATCH][GCC][AArch64] Have empty HWCAPs string ignored during native feature detection

2019-02-27 Thread James Greenhalgh

On Thu, Feb 07, 2019 at 04:43:24AM -0600, Tamar Christina wrote: > Hi All, > > Since this hasn't been reviewed yet anyway I've updated this patch to also > fix the memory leaks etc. > > -- > > This patch makes the feature detection code for AArch64 GCC not add features > automatically when the

Re: Re : add tsv110 pipeline scheduling

2019-03-14 Thread James Greenhalgh

On Sat, Feb 23, 2019 at 01:28:22PM +, wuyuan (E) wrote: > Hi, James: > Sorry for not responding to your email in time because of the Chinese New Year > holiday and urgent work. The three questions you mentioned in your last email are due > to my misunderstanding of the pipeline. > the first question, These

Re: [PATCH, wwwdocs] Mention -march=armv8.5-a and other new command line options for AArch64 and Arm for GCC 9

2019-03-22 Thread James Greenhalgh

On Wed, Mar 20, 2019 at 10:17:41AM +, Sudakshina Das wrote: > Hi Kyrill > > On 12/03/2019 12:03, Kyrill Tkachov wrote: > > Hi Sudi, > > > > On 2/22/19 10:45 AM, Sudakshina Das wrote: > >> Hi > >> > >> This patch documents the addition of the new Armv8.5-A and corresponding > >> extensions in

Re: [Patch, aarch64] PR 89628 - fix register allocation in SIMD functions

2019-03-22 Thread James Greenhalgh

On Mon, Mar 11, 2019 at 04:10:15PM +, Steve Ellcey wrote: > Richard, > > I don't necessarily disagree with anything in your comments and long > term I think that is the right direction, but I wonder if that level of > change is appropriate for GCC Stage 4 which is where we are now. Your > cha

Re: [PATCH/AARCH64] Fix zero_extendsidi2_aarch64 type attribute

2019-03-22 Thread James Greenhalgh

On Sun, Mar 10, 2019 at 06:26:07PM +, Andrew Pinski wrote: > Hi, > "uxtw x0, w1" is an alias for "mov w0, w1" but currently the > back-end marks it as extend type rather than mov_reg. This patch > fixes that. For most schedule models, this does not matter; I am > adding one where mov (both

Re: [Patch, aarch64] PR 89628 - fix register allocation in SIMD functions

2019-03-22 Thread James Greenhalgh

On Fri, Mar 22, 2019 at 05:35:02PM +, James Greenhalgh wrote: > On Mon, Mar 11, 2019 at 04:10:15PM +, Steve Ellcey wrote: > > Richard, > > > > I don't necessarily disagree with anything in your comments and long > > term I think that is the right directio

Re: Re : add tsv110 pipeline scheduling

2019-04-03 Thread James Greenhalgh

OK for trunk. Thank you for your many clarifications. Will you need one of us to apply this to trunk on your behalf? If you would like me to apply your patch, please provide the full ChangeLog with author information, like so: 2019-04-03 James Greenhalgh Second Author

Re: Re : add tsv110 pipeline scheduling

2019-04-08 Thread James Greenhalgh

* config/aarch64/tsv110.md: New file. > > Thanks, > wuyuan > > -Original Message- > From: James Greenhalgh [mailto:james.greenha...@arm.com] > Sent: 2019

Re: [PATCH, GCC, AARCH64] Add GNU note section with BTI and PAC.

2019-04-18 Thread James Greenhalgh

On Thu, Apr 04, 2019 at 05:01:06PM +0100, Sudakshina Das wrote: > Hi Richard > > On 03/04/2019 11:28, Richard Henderson wrote: > > On 4/3/19 5:19 PM, Sudakshina Das wrote: > >> + /* PT_NOTE header: namesz, descsz, type. > >> + namesz = 4 ("GNU\0") > >> + descsz = 16 (Size of the program p

Re: [PATCH 6/9][GCC][AArch64] Add Armv8.3-a complex intrinsics

2019-01-09 Thread James Greenhalgh

On Fri, Dec 21, 2018 at 11:57:55AM -0600, Tamar Christina wrote: > Hi All, > > This updated patch adds NEON intrinsics and tests for the Armv8.3-a complex > multiplication and add instructions with a rotate along the Argand plane. > > The instructions are documented in the ArmARM[1] and the intri

Re: [PATCH][AArch64] Use Q-reg loads/stores in movmem expansion

2019-01-09 Thread James Greenhalgh

On Fri, Dec 21, 2018 at 06:30:49AM -0600, Kyrill Tkachov wrote: > Hi all, > > Our movmem expansion currently emits TImode loads and stores when copying > 128-bit chunks. > This generates X-register LDP/STP sequences as these are the most preferred > registers for that mode. > > For the purpose

Re: [RFC][AArch64] Add support for system register based stack protector canary access

2019-01-10 Thread James Greenhalgh

On Mon, Dec 03, 2018 at 03:55:36AM -0600, Ramana Radhakrishnan wrote: > For quite some time the kernel guys (more specifically Ard) have been > talking about using a system register (sp_el0) and an offset from that > for a canary based access. This patchset adds support for a new set of > command

Re: [PATCH][GCC][AArch64] Fix big-endian neon-intrinsics ICEs

2019-01-16 Thread James Greenhalgh

On Mon, Jan 14, 2019 at 08:01:47AM -0600, Tamar Christina wrote: > Hi All, > > > This patch fixes some ICEs when the fcmla_lane intrinsics are used on > big endian by correcting the lane indices and removing the hardcoded byte > offset from subreg calls and instead use subreg_lowpart_offset. Woo

Re: [PATCH][GCC][AArch64] Rename stack-clash CFA register to avoid clash.

2019-01-16 Thread James Greenhalgh

On Wed, Jan 16, 2019 at 11:03:41AM -0600, Tamar Christina wrote: > Hi All, > > We had multiple patches in flight that required use of scratch registers in > frame layout code. As it happens two of these features picked the same > register > and landed at around the same time. As such there is

Re: [PATCH][AArch64] Initial -mcpu=ares tuning

2019-01-16 Thread James Greenhalgh

On Tue, Jan 15, 2019 at 09:29:46AM -0600, Kyrill Tkachov wrote: > Hi all, > > This patch adds a tuning struct for the Arm Ares CPU and uses it for > -m{cpu,tune}=ares. > The tunings are an initial attempt and may be improved upon in the future, > but they serve > as a decent starting point for G

Re: [PATCH] PR target/85596 Add --with-multilib-list doc for aarch64

2019-01-17 Thread James Greenhalgh

On Mon, Jan 07, 2019 at 09:07:35AM -0600, Christophe Lyon wrote: > Hi, > > This small patch adds a short description of --with-multilib-list for aarch64. > OK? OK. Thanks, James > > Thanks, > > Christophe > 2019-01-07 Christophe Lyon > > PR target/85596 > * doc/install.texi (

Re: [PATCH] Fix arm_neon.h #pragma GCC target syntax (PR target/88734)

2019-01-17 Thread James Greenhalgh

On Thu, Jan 17, 2019 at 07:47:32AM -0600, Jakub Jelinek wrote: > Hi! > > arm_neon.h on both targets contained a couple of spots with invalid > #pragma GCC target syntax. This doesn't result in errors, just warnings and > those warnings are suppressed in system headers, so are visible with > -Wsys

Re: add tsv110 pipeline scheduling

2019-01-17 Thread James Greenhalgh

Cortex-A57 as an example: > (define_insn_reservation > "cortex_a57_neon_load_d" 11 > (and (eq_attr "tune" "cortexa57") >(eq_attr "cortex_a57_neon_type" "neon_load_d")) > "ca57_cx1_issue+ca57_cx2_issue, >ca57

Re: [PATCH][wwwdocs][Arm][AArch64] Update changes with new features and flags.

2019-01-30 Thread James Greenhalgh

On Wed, Jan 23, 2019 at 04:43:02AM -0600, Tamar Christina wrote: > Hi All, > > This patch adds the documentation for Stack clash protection and Armv8.3-a > support to > changes.html for GCC 9. > I have validated the html using the W3C validator. > > Ok for cvs? Almost OK by me. > > Thanks, >

Re: [PATCH][AArch64] Use implementation namespace consistently in arm_neon.h

2019-02-06 Thread James Greenhalgh

On Wed, Feb 06, 2019 at 07:52:42AM -0600, Kyrill Tkachov wrote: > [resending with patch compressed] > > Hi all, > > We're somewhat inconsistent in arm_neon.h when it comes to using the > implementation namespace for local > identifiers. This means things like: > #define hash_abcd 0 > #define has

Re: [PATCH][AArch64] Use neon_dot_q type for 128-bit [US]DOT instructions where appropriate

2019-02-06 Thread James Greenhalgh

On Tue, Feb 05, 2019 at 11:52:10AM -0600, Kyrill Tkachov wrote: > Hi all, > > For the Dot Product instructions we have the scheduling types neon_dot and > neon_dot_q for the 128-bit versions. > It seems that we're only using the former though, not assigning the > neon_dot_q type anywhere. > > T

Re: [PATCH][AArch64] Change representation of SABD in RTL

2019-02-06 Thread James Greenhalgh

On Mon, Feb 04, 2019 at 04:23:32AM -0600, Kyrill Tkachov wrote: > Hi all, > > Richard raised a concern about the RTL we use to represent the AdvSIMD SABD > (vector signed absolute difference) instruction. > We currently represent it as ABS (MINUS op1 op2). > > This isn't exactly what SABD does. A
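The preview is cut off before it says how ABS (MINUS op1 op2) differs from SABD. One plausible reading, taken here as an assumption, is that the subtraction wraps at the element width while the instruction computes the absolute difference without intermediate overflow. A minimal sketch on 8-bit lanes, with hypothetical function names:

```c
#include <assert.h>
#include <stdint.h>

/* ABS (MINUS a b) evaluated entirely at 8-bit element width:
   the subtraction wraps modulo 256 before the absolute value is taken. */
uint8_t abs_of_minus8(int8_t a, int8_t b)
{
    uint8_t diff = (uint8_t)((uint8_t)a - (uint8_t)b);
    /* Negate (still modulo 256) when the 8-bit sign bit is set. */
    return (diff & 0x80) ? (uint8_t)(0u - diff) : diff;
}

/* Absolute difference with no intermediate overflow: larger minus smaller,
   computed in int and then truncated to the 8-bit lane. */
uint8_t absdiff8(int8_t a, int8_t b)
{
    int hi = a > b ? a : b;
    int lo = a > b ? b : a;
    return (uint8_t)(hi - lo);
}

/* Self-check: the two forms agree on small inputs and differ at the extremes. */
void absdiff_self_check(void)
{
    assert(abs_of_minus8(3, 5) == absdiff8(3, 5));
    assert(abs_of_minus8(127, -128) != absdiff8(127, -128));
}
```

For the inputs 127 and -128 the wrapped form yields 1 while the larger-minus-smaller form yields 255, which is why a max/min based representation is a natural alternative to consider.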

Re: [Aarch64][SVE] Vectorise sum-of-absolute-differences

2019-02-06 Thread James Greenhalgh

On Mon, Feb 04, 2019 at 07:34:05AM -0600, Alejandro Martinez Vicente wrote: > Hi, > > This patch adds support to vectorize sum of absolute differences (SAD_EXPR) > using SVE. It also uses the new functionality to ensure that the resulting > loop > is masked. Therefore, it depends on > > https://

Re: [PATCH 8/17][ARM] Add VFP FP16 arithmetic instructions.

2016-08-03 Thread James Greenhalgh

On Wed, Aug 03, 2016 at 12:52:42PM +0100, Ramana Radhakrishnan wrote: > On Thu, Jul 28, 2016 at 12:37 PM, Ramana Radhakrishnan > wrote: > > On Mon, Jul 4, 2016 at 3:02 PM, Matthew Wahab > > wrote: > >> On 19/05/16 15:54, Matthew Wahab wrote: > >>> On 18/05/16 16:20, Joseph Myers wrote: > On

Re: [AArch64] Handle HFAs of float16 types properly

2016-08-04 Thread James Greenhalgh

On Tue, Jul 26, 2016 at 02:55:02PM +0100, James Greenhalgh wrote: > > Hi, > > It looks like we've not been handling structures of 16-bit floating-point > data correctly for AArch64. For some reason we end up passing them > packed in to integer registers. That is to say,

Re: [PATCH] Fix wrong code on aarch64 due to paradoxical subreg

2016-08-04 Thread James Greenhalgh

PR rtl-optimization/70903 * gcc.c-torture/execute/pr70903.c: New test. .../gcc/testsuite/gcc.c-torture/execute/pr70903.c:25:1: error: redefinition of 'foo' .../gcc/testsuite/gcc.c-torture/execute/pr70903.c:6:1: note: previous definition of 'foo

Re: [AArch64] Handle HFAs of float16 types properly

2016-08-05 Thread James Greenhalgh

On Fri, Aug 05, 2016 at 11:00:39AM +0100, Yao Qi wrote: > On Tue, Jul 26, 2016 at 2:55 PM, James Greenhalgh > wrote: > > > > OK? As this is an ABI break, I'm not proposing for it to go back to GCC 6, > > though it will apply cleanly there if the maintainers support th

Re: [AArch64] Handle HFAs of float16 types properly

2016-08-05 Thread James Greenhalgh

On Fri, Aug 05, 2016 at 11:15:24AM +0100, James Greenhalgh wrote: > On Fri, Aug 05, 2016 at 11:00:39AM +0100, Yao Qi wrote: > > On Tue, Jul 26, 2016 at 2:55 PM, James Greenhalgh > > wrote: > > > > > > OK? As this is an ABI break, I'm not proposing for it to

Re: backward threading heuristics tweek

2016-08-08 Thread James Greenhalgh

On Sun, Aug 07, 2016 at 10:30:48AM -0700, Andrew Pinski wrote: > On Mon, Jun 6, 2016 at 3:19 AM, Jan Hubicka wrote: > > Hi, > > while looking into profile mismatches introduced by the backward threading > > pass > > I noticed that the heuristics seem quite simplistic. First it should be > > pr

Re: [PATCH AArch64/V3]Add new patterns for vcond_mask and vec_cmp

2016-08-08 Thread James Greenhalgh

On Mon, Aug 01, 2016 at 01:18:53PM +, Bin Cheng wrote: > Hi, > This is the 3rd version patch implementing vcond_mask and vec_cmp patterns on > AArch64. Bootstrap and test along with next patch on AArch64, is it OK? OK, with a couple of comments below, one on an extension and one style nit.

Re: [PATCH AArch64][V3]Rewrite vcond patterns using vcond_mask/vec_cmp, also support missing vect_cond_mixed patterns

2016-08-08 Thread James Greenhalgh

On Mon, Aug 01, 2016 at 01:19:54PM +, Bin Cheng wrote: > Hi, > This is the 3rd version patch implementing vcond patterns on AArch64. It > rewrites vcond patterns using newly introduced vcond_mask and vec_cmp > patterns in previous patch. It also adds missing vect_cond_mixed patterns > for AAr

Re: [PATCH AArch64/V3]Add new patterns for vcond_mask and vec_cmp

2016-08-09 Thread James Greenhalgh

On Mon, Aug 08, 2016 at 12:10:00PM +0100, Bin.Cheng wrote: > On Mon, Aug 8, 2016 at 11:40 AM, James Greenhalgh > wrote: > > On Mon, Aug 01, 2016 at 01:18:53PM +, Bin Cheng wrote: > >> Hi, > >> This is the 3rd version patch implementing vcond_mask and vec_cmp

Re: Implement C _FloatN, _FloatNx types [version 3]

2016-08-09 Thread James Greenhalgh

On Thu, Jul 28, 2016 at 10:43:25PM +, Joseph Myers wrote: > On Tue, 19 Jul 2016, James Greenhalgh wrote: > > > These slightly complicate the description you give above as we now want > > two behaviours. Where the 16-bit floating point extensions are available, > > w

Re: backward threading heuristics tweek

2016-08-12 Thread James Greenhalgh

On Thu, Aug 11, 2016 at 01:35:16PM +0200, Jan Hubicka wrote: > > On Mon, Jun 6, 2016 at 3:19 AM, Jan Hubicka wrote: > > > Hi, > > > while looking into profile mismatches introduced by the backward > > > threading pass > > > I noticed that the heuristics seem quite simplistic. First it should b

Re: [PATCH PR69848]Avoid not insn by inverting comparison code in vcond patterns

2016-08-16 Thread James Greenhalgh

On Wed, Aug 10, 2016 at 04:00:16PM +, Bin Cheng wrote: > Hi, > This is a follow up patch for previous vcond patches. In previous ones, > we rely on combiner to simplify "X = !Y; Z = X ? A : B" into "Z = Y ? B : A". > That works for some cases, but not all of them, for example, case in > PR69
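The rewrite quoted in this preview, "X = !Y; Z = X ? A : B" into "Z = Y ? B : A", can be sketched lane by lane. The code below is an illustration in plain C, not the vcond patterns themselves; the lane count, the function names and the mask-based select helper are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 4

/* Bitwise select: takes bits of x where mask is set, bits of y elsewhere. */
static uint32_t bit_select(uint32_t mask, uint32_t x, uint32_t y)
{
    return (mask & x) | (~mask & y);
}

/* r = (p != q) ? a : b, done as: EQ compare, explicit NOT of the mask, select. */
void vcond_ne_with_not(uint32_t r[LANES], const uint32_t p[LANES],
                       const uint32_t q[LANES], const uint32_t a[LANES],
                       const uint32_t b[LANES])
{
    for (int i = 0; i < LANES; i++) {
        uint32_t eq = (p[i] == q[i]) ? UINT32_MAX : 0;
        uint32_t ne = ~eq;                 /* the extra NOT step */
        r[i] = bit_select(ne, a[i], b[i]);
    }
}

/* Same result without the NOT: keep the EQ mask and swap the select arms. */
void vcond_ne_swapped(uint32_t r[LANES], const uint32_t p[LANES],
                      const uint32_t q[LANES], const uint32_t a[LANES],
                      const uint32_t b[LANES])
{
    for (int i = 0; i < LANES; i++) {
        uint32_t eq = (p[i] == q[i]) ? UINT32_MAX : 0;
        r[i] = bit_select(eq, b[i], a[i]);
    }
}

/* Self-check: both forms produce identical lanes for one sample input. */
void vcond_self_check(void)
{
    const uint32_t p[LANES] = {1, 2, 3, 4}, q[LANES] = {1, 0, 3, 0};
    const uint32_t a[LANES] = {10, 11, 12, 13}, b[LANES] = {20, 21, 22, 23};
    uint32_t r1[LANES], r2[LANES];
    vcond_ne_with_not(r1, p, q, a, b);
    vcond_ne_swapped(r2, p, q, a, b);
    for (int i = 0; i < LANES; i++)
        assert(r1[i] == r2[i]);
}
```

Swapping the arms removes one operation per vector when the inverted comparison has no direct instruction, which appears to be the motivation given in the thread subject.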

Re: Implement C _FloatN, _FloatNx types [version 5]

2016-08-17 Thread James Greenhalgh

On Fri, Jul 22, 2016 at 09:59:33PM +, Joseph Myers wrote: > Index: gcc/testsuite/gcc.dg/torture/fp-int-convert-float16-timode.c > === > --- gcc/testsuite/gcc.dg/torture/fp-int-convert-float16-timode.c > (nonexistent) > +++ gc

Re: [PATCH] [GCC] Don't use section anchors for declarations that don't fit in a single anchor range

2016-08-18 Thread James Greenhalgh

On Wed, Aug 17, 2016 at 09:09:07PM -0600, Jeff Law wrote: > On 08/17/2016 02:23 AM, Richard Biener wrote: > >On Tue, Aug 16, 2016 at 6:06 PM, Jeff Law wrote: > >>On 08/16/2016 08:01 AM, Tamar Christina wrote: > >>> > >>> > >>>Hi All, > >>> > >>>This patch turns off the usage of section anchors for

Re: [PATCH][Aarch64][gcc] Fix vld2/3/4 on big endian systems

2016-08-30 Thread James Greenhalgh

On Thu, Aug 18, 2016 at 10:15:12AM +0100, Tamar Christina wrote: > Hi all, > > This fixes a bug in the vector load functions in which they load the > vector in the wrong order for big endian systems. This patch flips the > order conditionally in the vec_concats. > > No testcase given because plen

Re: Implement C _FloatN, _FloatNx types [version 6]

2016-08-31 Thread James Greenhalgh

On Fri, Aug 19, 2016 at 04:23:55PM +, Joseph Myers wrote: > On Fri, 19 Aug 2016, Szabolcs Nagy wrote: > > > On 17/08/16 21:17, Joseph Myers wrote: > > > Although there is HFmode support for ARM and AArch64, use of that for > > > _Float16 is not enabled. Supporting _Float16 would require addit

[Patch AArch64] Add floatdihf2 and floatunsdihf2 patterns

2016-09-06 Thread James Greenhalgh

ested on aarch64-none-linux-gnu OK for trunk? James --- 2016-09-06 James Greenhalgh * config/aarch64/aarch64.md (sihf2): Convert to expand. (dihf2): Likewise. (aarch64_fp16_hf2): New. 2016-09-06 James Greenhalgh * gcc.target/aarch64/floatdihf2_1.c: New.

[Patch RFC] Modify excess precision logic to permit FLT_EVAL_METHOD=16

2016-09-06 Thread James Greenhalgh

" from an _Float16 type in the case that -fexcess-precision=fast. If we don't do this, then the "fast" case will spend more time promoting and demoting between HFmode and SFmode and the consequence will be slower code. Bootstrapped on AArch64 and x86_64. OK? Thanks, Jame

[Patch libgcc] Enable HCmode multiply and divide (mulhc3/divhc3)

2016-09-07 Thread James Greenhalgh

Hi, This patch arranges for half-precision complex multiply and divide routines to be built if __LIBGCC_HAS_HF_MODE__. This will be true if the target supports the _Float16 type. OK? Thanks, James --- libgcc/ 2016-09-07 James Greenhalgh * Makefile.in (lib2funcs): Build _mulhc3

Re: [PATCH 1/3][AArch64] Add more choices for the reciprocal square root approximation

2016-05-25 Thread James Greenhalgh

On Wed, Apr 27, 2016 at 04:13:33PM -0500, Evandro Menezes wrote: >gcc/ > * config/aarch64/aarch64-protos.h > (AARCH64_APPROX_MODE): New macro. > (AARCH64_APPROX_{NONE,SP,DP,DFORM,QFORM,SCALAR,VECTOR,ALL}): >Likewise. > (tune_params): New member "approx_rsqrt_

Re: [PATCH 2/3][AArch64] Emit square root using the Newton series

2016-05-25 Thread James Greenhalgh

On Wed, Apr 27, 2016 at 04:15:45PM -0500, Evandro Menezes wrote: >gcc/ > * config/aarch64/aarch64-protos.h > (aarch64_emit_approx_rsqrt): Replace with new function > "aarch64_emit_approx_sqrt". > (tune_params): New member "approx_sqrt_modes". > * config/a
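For readers unfamiliar with the technique named in this thread's subject, here is a rough sketch of a square root built from a Newton iteration on the reciprocal square root. It is not the code from the patch: the function names, the bit-level initial guess (a stand-in for a hardware estimate instruction), the iteration count and the assumption of 32-bit IEEE float are all choices made for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Crude initial guess for 1/sqrt(d), assuming 32-bit IEEE float.
   Stands in for a hardware estimate; the constant is the well-known
   bit-trick value, used here only for illustration. */
static float rsqrt_estimate(float d)
{
    uint32_t bits;
    float x;
    memcpy(&bits, &d, sizeof bits);
    bits = 0x5f3759dfu - (bits >> 1);
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* One Newton step for f(x) = 1/x^2 - d:  x' = x * (3 - d*x*x) / 2. */
static float rsqrt_step(float d, float x)
{
    return x * (3.0f - d * x * x) * 0.5f;
}

/* sqrt(d) = d * (1/sqrt(d)); valid for positive, normal d only. */
float approx_sqrt(float d)
{
    assert(d > 0.0f);
    float x = rsqrt_estimate(d);
    for (int i = 0; i < 3; i++)   /* iteration count chosen arbitrarily */
        x = rsqrt_step(d, x);
    return d * x;
}
```

Each step roughly squares the relative error, so a few iterations after a coarse estimate are enough for single precision; how many a compiler should emit is a tuning decision.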

Re: [PATCH 3/3][AArch64] Emit division using the Newton series

2016-05-25 Thread James Greenhalgh

On Wed, Apr 27, 2016 at 04:15:53PM -0500, Evandro Menezes wrote: >gcc/ > * config/aarch64/aarch64-protos.h > (tune_params): Add new member "approx_div_modes". > (aarch64_emit_approx_div): Declare new function. > * config/aarch64/aarch64.c > (generic_tunin
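Division by a Newton series follows the same idea, iterating on the reciprocal of the divisor and then multiplying. The sketch below is an illustration only, with hypothetical names; the bit-level initial guess and its constant are assumptions standing in for a hardware reciprocal estimate, and 32-bit IEEE float is assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Crude initial guess for 1/d, assuming 32-bit IEEE float and d > 0.
   The constant is an illustrative choice, not taken from the patch. */
static float recip_estimate(float d)
{
    uint32_t bits;
    float x;
    memcpy(&bits, &d, sizeof bits);
    bits = 0x7EF311C7u - bits;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* One Newton step for f(x) = 1/x - d:  x' = x * (2 - d*x). */
static float recip_step(float d, float x)
{
    return x * (2.0f - d * x);
}

/* a / b computed as a * (1/b); valid for positive, normal b only. */
float approx_div(float a, float b)
{
    assert(b > 0.0f);
    float x = recip_estimate(b);
    for (int i = 0; i < 4; i++)   /* iteration count chosen arbitrarily */
        x = recip_step(b, x);
    return a * x;
}
```

The result is not guaranteed to be correctly rounded, which is one reason such expansions are usually tied to relaxed floating-point options rather than enabled unconditionally.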

Re: [AArch64, 2/4] Extend vector mutiply by element to all supported modes

2016-05-26 Thread James Greenhalgh

On Wed, May 18, 2016 at 02:13:53PM +0100, Jiong Wang wrote: > Thanks for reporting this. > > Yes, reproduced. I should force those res* local variables into > memory so they can be in the same order as the expected result > which is kept in memory. > > The following patch fixes this. > > vmul_elem_

Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-05-26 Thread James Greenhalgh

On Mon, May 16, 2016 at 11:38:04AM +0100, Wilco Dijkstra wrote: > GCC expands switch statements in a very simplistic way and tries to use a > table > expansion even when it is a bad idea for performance or codesize. > GCC typically emits extremely sparse tables that contain mostly default > entri

Re: [PATCH/AARCH64/ILP32] Fix unwinding (libgcc)

2016-05-26 Thread James Greenhalgh

On Wed, Apr 27, 2016 at 02:13:21PM -0700, Andrew Pinski wrote: > Hi, > AARCH64 ILP32 is like x32 where UNITS_PER_WORD > sizeof(void*) so we > need to define REG_VALUE_IN_UNWIND_CONTEXT for ILP32. This fixes > unwinding through the signal handler. This is independent of the ABI > which Linux ker

Re: [PATCH][AArch64] Tie operand 1 to operand 0 in AESMC pattern when AES/AESMC fusion is enabled

2016-05-26 Thread James Greenhalgh

On Fri, May 20, 2016 at 11:04:32AM +0100, Kyrill Tkachov wrote: > Hi all, > > The recent -frename-registers change exposed a deficiency in the way we fuse > AESE/AESMC instruction pairs in aarch64. > > Basically we want to enforce: > AESE Vn, _ > AESMC Vn, Vn > > to enable the fusion, bu

Re: [PATCH][AArch64] Adjust SIMD integer preference

2016-05-26 Thread James Greenhalgh

On Fri, Apr 22, 2016 at 03:35:42PM +, Wilco Dijkstra wrote: > SIMD operations like combine prefer to have their operands in FP registers, > so increase the cost of integer registers slightly to avoid unnecessary > int<->FP moves. This improves register allocation of scalar SIMD operations. I r

Re: [AArch64][1/4] Enable tree-stdarg pass for AArch64 by defining counter fields

2016-05-26 Thread James Greenhalgh

On Fri, May 06, 2016 at 04:00:13PM +0100, Jiong Wang wrote: > This patch initializes va_list_gpr_counter_field and > va_list_fpr_counter_field properly for the AArch64 backend so that the tree-stdarg > pass will be enabled. > > The "required register" analysis is largely target independent, but the > user mig

Re: [AArch64][2/4] PR63596, honor tree-stdarg analysis result to improve VAARG codegen

2016-05-26 Thread James Greenhalgh

On Fri, May 06, 2016 at 04:00:28PM +0100, Jiong Wang wrote: > This patch fixes PR63596. > > There is no need to push/pop all argument registers. We only need to > push and pop the registers that are used. This usage information is calculated by a > dedicated vaarg optimization tree pass "tree-stdarg", the backe

Re: [PATCH][AArch64] Delete obsolete CC_ZESWP and CC_SESWP CC modes

2016-05-27 Thread James Greenhalgh

On Wed, Apr 27, 2016 at 03:12:10PM +0100, Kyrill Tkachov wrote: > Hi all, > > The CC_ZESWP and CC_SESWP are not used anywhere and seem to be a remnant of > some > old code that was removed. The various compare+extend patterns in aarch64.md > don't > use these modes. So it should be safe to remov

Re: [PATCH][AArch64] Remove aarch64_cannot_change_mode_class

2016-05-27 Thread James Greenhalgh

On Thu, May 19, 2016 at 12:23:32PM +0100, Wilco Dijkstra wrote: > Remove aarch64_cannot_change_mode_class as the underlying issue > (PR67609) has been resolved. This avoids a few unnecessary lane > widening operations like: > > faddp d18, v18.2d > mov d18, v18.d[0] > > Passes regress, OK

Re: [PATCH][AArch64] Simplify ashl3 expander for SHORT modes

2016-05-27 Thread James Greenhalgh

On Wed, Apr 27, 2016 at 03:10:47PM +0100, Kyrill Tkachov wrote: > Hi all, > > The ashl3 expander for QI and HI modes is needlessly obfuscated. > The 2nd operand predicate accepts nonmemory_operand but the expand code > FAILs if it's not a CONST_INT. We can just demand a const_int_operand in > the

Re: [PATCH][2/3][AArch64] Keep CTZ components together until after reload

2016-05-27 Thread James Greenhalgh

On Thu, May 26, 2016 at 10:53:07AM +0100, Kyrill Tkachov wrote: > Hi all, > > In a similar rationale to patch 1/3 this patch changes the AArch64 backend to > keep the CTZ expression as a single RTX until after reload when it is split > into an RBIT and a CLZ instruction. This enables CTZ-specific
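The split mentioned in this preview relies on the identity that counting trailing zeros equals counting leading zeros of the bit-reversed value. A small portable sketch follows; the helper names are hypothetical and the loops stand in for the RBIT and CLZ instructions rather than using any compiler builtin.

```c
#include <assert.h>
#include <stdint.h>

/* Portable stand-in for RBIT: reverses the bit order of a 32-bit value. */
uint32_t rbit32(uint32_t x)
{
    x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);
    x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);
    x = ((x >> 4) & 0x0F0F0F0Fu) | ((x & 0x0F0F0F0Fu) << 4);
    x = ((x >> 8) & 0x00FF00FFu) | ((x & 0x00FF00FFu) << 8);
    return (x >> 16) | (x << 16);
}

/* Portable stand-in for CLZ: number of zero bits above the highest set bit.
   Returns 32 when x is zero. */
unsigned clz32(uint32_t x)
{
    unsigned n = 0;
    for (uint32_t bit = 0x80000000u; bit != 0 && (x & bit) == 0; bit >>= 1)
        n++;
    return n;
}

/* CTZ expressed as CLZ of the bit-reversed input. */
unsigned ctz32(uint32_t x)
{
    return clz32(rbit32(x));
}

/* Self-check against a direct trailing-zero count for every single-bit input. */
void ctz32_self_check(void)
{
    for (unsigned k = 0; k < 32; k++)
        assert(ctz32((uint32_t)1 << k) == k);
}
```

Keeping the operation as one CTZ expression until late in compilation lets earlier passes reason about it as a unit; the two-instruction form only needs to appear once the split happens.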

Re: [PATCH, AArch64] atomics: prefetch the destination for write prior to ldxr/stxr loops

2016-05-27 Thread James Greenhalgh

On Tue, Mar 15, 2016 at 03:31:30PM +, James Greenhalgh wrote: > On Mon, Mar 07, 2016 at 10:54:25PM -0800, Andrew Pinski wrote: > > On Mon, Mar 7, 2016 at 8:12 PM, Yangfei (Felix) > > wrote: > > >> On Mon, Mar 7, 2016 at 7:27 PM, Yangfei (Felix)

Re: [AArch64, 1/6] Reimplement scalar fixed-point intrinsics

2016-05-27 Thread James Greenhalgh

On Tue, May 24, 2016 at 09:23:36AM +0100, Jiong Wang wrote: > This patch reimplements scalar intrinsics for conversion between floating- > point and fixed-point. > > Previously, all such intrinsics were implemented through inline assembly. > This patch adds RTL patterns for these operations that tho

Re: [AArch64, 4/6] Reimplement frsqrts intrinsics

2016-05-27 Thread James Greenhalgh

On Tue, May 24, 2016 at 09:23:53AM +0100, Jiong Wang wrote: > Similar to [3/6], these intrinsics were implemented before the instruction > pattern "aarch64_rsqrts" was added, so they were implemented > through inline assembly. > > This migrates the implementation to a builtin. > > gcc/ >

Re: [AArch64, 3/6] Reimplement frsqrte intrinsics

2016-05-27 Thread James Greenhalgh

On Tue, May 24, 2016 at 09:23:48AM +0100, Jiong Wang wrote: > These intrinsics were implemented before the instruction pattern > "aarch64_rsqrte" was added, so they were implemented through > inline assembly. > > This migrates the implementation to a builtin. > > gcc/ > 2016-05-23 Jiong

Re: [AArch64, 5/6] Reimplement fabd intrinsics & merge rtl patterns

2016-05-27 Thread James Greenhalgh

On Tue, May 24, 2016 at 09:23:58AM +0100, Jiong Wang wrote: > These intrinsics were implemented before "fabd_3" was introduced. > Meanwhile > the patterns "fabd_3" and "*fabd_scalar3" can be merged into a > single "fabd3" using VALLF. > > This patch migrates the implementation to builtins backed by thi

Re: [AArch64, 6/6] Reimplement vpadd intrinsics & extend rtl patterns to all modes

2016-05-27 Thread James Greenhalgh

On Tue, May 24, 2016 at 09:24:03AM +0100, Jiong Wang wrote: > These intrinsics were implemented by inline assembly using the "faddp" > instruction. > There was a pattern "aarch64_addpv4sf" which supports V4SF mode only > while we can > extend this pattern to support VDQF mode, then we can reimplement the

Re: [PATCH 3/3][AArch64] Emit division using the Newton series

2016-05-31 Thread James Greenhalgh

On Fri, May 27, 2016 at 05:57:30PM -0500, Evandro Menezes wrote: > On 05/25/16 11:16, James Greenhalgh wrote: > >On Wed, Apr 27, 2016 at 04:15:53PM -0500, Evandro Menezes wrote: > >>gcc/ > >> * config/aarch64/aarch64-protos.h > >> (tune_param

Re: [PATCH][AArch64] Enable -frename-registers at -O2 and higher

2016-05-31 Thread James Greenhalgh

On Fri, May 27, 2016 at 02:50:15PM +0100, Kyrill Tkachov wrote: > Hi all, > > As mentioned in https://gcc.gnu.org/ml/gcc-patches/2016-05/msg00297.html, > frename-registers registers can be beneficial for aarch64 and the patch at > https://gcc.gnu.org/ml/gcc-patches/2016-05/msg01618.html resolves t

Re: [PATCH][AArch64] Remove aarch64_simd_attr_length_move

2016-05-31 Thread James Greenhalgh

On Fri, May 27, 2016 at 05:57:17PM +0100, Kyrill Tkachov wrote: > Hi all, > > I notice that we can do without aarch64_simd_attr_length_move. The move > alternatives for > the OI,CI,XImode modes that involve memory operands all use a single > load/store so are always > length 4, whereas the regis

Re: [PATCH][AArch64] Use aarch64_fusion_enabled_p to check for insn fusion capabilities

2016-05-31 Thread James Greenhalgh

On Fri, May 27, 2016 at 06:10:42PM -0500, Evandro Menezes wrote: > On 05/27/16 11:59, Kyrill Tkachov wrote: > >Hi all, > > > >This patch is a small cleanup that uses the newly introduced > >aarch64_fusion_enabled_p predicate > >to check for what fusion opportunities are enabled for the current > >t

Re: [PATCH] AARCH64: Remove spurious attribute unused from NEON intrinsic

2016-05-31 Thread James Greenhalgh

On Mon, Apr 25, 2016 at 05:47:57PM +0100, James Greenhalgh wrote: > On Mon, Apr 25, 2016 at 05:39:45PM +0200, Wladimir J. van der Laan wrote: > > > > Thanks for the info with regard to contributing, > > > > On Fri, Apr 22, 2016 at 09:40:11AM +0100, James Greenhalgh w

Re: [PATCH AArch64]Support missing vcond pattern by adding/using vec_cmp/vcond_mask patterns.

2016-05-31 Thread James Greenhalgh

On Tue, May 17, 2016 at 09:02:22AM +, Bin Cheng wrote: > Hi, > Alan and Renlin noticed that some vcond patterns are not supported in > AArch64(or AArch32?) backend, and they both had some patches fixing this. > After investigation, I agree with them that vcond/vcondu in AArch64's backend > shou

Re: [AArch64][3/4] Don't generate redundant checks when there is no composite arg

2016-05-31 Thread James Greenhalgh

On Fri, May 06, 2016 at 04:00:40PM +0100, Jiong Wang wrote: > 2016-05-06 Jiong Wang > > gcc/ > * config/aarch64/aarch64.c (aarch64_gimplify_va_arg_expr): Avoid > duplicated check code. > > gcc/testsuite/ > * gcc.target/aarch64/va_arg_4.c: New testcase. I wonder whether this is safe for

Re: [PATCH 1/3][AArch64] Add more choices for the reciprocal square root approximation

2016-06-01 Thread James Greenhalgh

On Fri, May 27, 2016 at 05:57:23PM -0500, Evandro Menezes wrote: > From 86d7690632d03ec85fd69bfaef8e89c0542518ad Mon Sep 17 00:00:00 2001 > From: Evandro Menezes > Date: Thu, 3 Mar 2016 18:13:46 -0600 > Subject: [PATCH 1/3] [AArch64] Add more choices for the reciprocal square root > approximation

Re: [PATCH 2/3][AArch64] Emit square root using the Newton series

2016-06-01 Thread James Greenhalgh

On Fri, May 27, 2016 at 05:57:26PM -0500, Evandro Menezes wrote: > 2016-04-04 Evandro Menezes > Wilco Dijkstra > > gcc/ > * config/aarch64/aarch64-protos.h > (aarch64_emit_approx_rsqrt): Replace with new function > "aarch64_emit_approx_sqrt". > (cpu_approx_

Re: [PATCH][AArch64] Add missing fcsel in Cortex-A57 scheduler

2016-06-02 Thread James Greenhalgh

On Thu, Jun 02, 2016 at 04:09:32PM +, Wilco Dijkstra wrote: > The Cortex-A57 scheduler is missing fcsel, so add it. > > OK for commit? OK. Thanks, James > > ChangeLog: > 2016-06-02 Wilco Dijkstra > > * config/arm/cortex-a57.md (cortex_a57_fp_cpys): Add fcsel. > > --- > diff --gi

Re: [PATCH][wwwdocs][AArch64] Mention -mcpu=qdf24xx support for GCC 6

2016-06-02 Thread James Greenhalgh

On Thu, Jun 02, 2016 at 03:54:43PM +0100, Kyrill Tkachov wrote: > Hi all, > > As discussed some time ago with Jim, here's the AArch64 note mentioning the > support for Qualcomm QDF24xx that was added in GCC 6. > > Ok to commit? OK. Thanks, James > Index: htdocs/gcc-6/changes.html > ===

[RFC: Patch 4/6] Modify cost model for noce_cmove_arith

2016-06-02 Thread James Greenhalgh

cost model should rely on the target giving back good information. A target that finds tests failing after this patch should consider either reducing the cost of a conditional move sequence, or increasing TARGET_RTX_BRANCH_COST. OK? Thanks, James --- gcc/ 2016-06-02 James Greenhalgh
