Re: [PATCH v2] aarch64: disable LDP via tuning structure for -mcpu=ampere1/1a

2023-04-17 Thread Philipp Tomsich
Applied to master, thanks! Philipp. On Mon, 17 Apr 2023 at 11:56, Kyrylo Tkachov wrote: > > > > -Original Message- > > From: Philipp Tomsich > > Sent: Friday, April 14, 2023 7:06 PM > > To: gcc-patches@gcc.gnu.org > > Cc: Kyrylo Tkachov ; Philip

Re: [PATCH v2] aarch64: disable LDP via tuning structure for -mcpu=ampere1/1a

2023-04-17 Thread Philipp Tomsich
OK for backport? This will be all the way down to GCC10, as I just realized that we need to backport the entire ampere1/1a support to GCC10 (we stopped at GCC11 for some unexplainable reason)... Philipp. On Mon, 17 Apr 2023 at 12:20, Philipp Tomsich wrote: > > Applied to master,

Re: [PATCH v2] aarch64: disable LDP via tuning structure for -mcpu=ampere1/1a

2023-04-17 Thread Philipp Tomsich
On Mon, 17 Apr 2023 at 17:07, Kyrylo Tkachov wrote: > > > > > -Original Message- > > From: Philipp Tomsich > > Sent: Monday, April 17, 2023 11:22 AM > > To: Kyrylo Tkachov > > Cc: gcc-patches@gcc.gnu.org; Di Zhao > > Subject: Re: [PATCH v2]

[PATCH 0/2, AArch64] APM X-Gene 1 cost-table and pipeline model

2014-11-19 Thread Philipp Tomsich
As briefly discussed with Marcus yesterday, I'm attaching two patches to enable more accurate instruction selection and scheduling on the APM X-Gene 1. Ok for master? -Philipp. Philipp Tomsich (2): Core definition for APM XGene-1 and associated cost-table. Pipeline model for APM

[PATCH 1/2, AArch64] Core definition for APM XGene-1 and associated cost-table.

2014-11-19 Thread Philipp Tomsich
@@ -1,3 +1,10 @@ +2014-11-19 Philipp Tomsich + + * config/aarch64/aarch64-cores.def (xgene1): Update/add the + xgene1 (APM XGene-1) core definition. + * gcc/config/aarch64/aarch64.c: Add cost tables for APM XGene-1 + * config/arm/aarch-cost-tables.h: Add cost tables for APM

[PATCH 2/2, AArch64] Pipeline model for APM XGene-1.

2014-11-19 Thread Philipp Tomsich
/ChangeLog index 5b389c5..9cc3b5a 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,5 +1,11 @@ 2014-11-19 Philipp Tomsich + * config/aarch64/aarch64.md: Include xgene1.md. + (generic_sched): Set to no for xgene1. + * config/arm/xgene1.md: New file. + +2014-11-19 Philipp Tomsich

[PATCH 1/2, AArch64, v2] Core definition for APM XGene-1 and associated cost-table.

2014-11-19 Thread Philipp Tomsich
/gcc/ChangeLog b/gcc/ChangeLog index 2fa58ca..c9ac0d9 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,11 @@ +2014-11-19 Philipp Tomsich + + * config/aarch64/aarch64-cores.def (xgene1): Update/add the + xgene1 (APM XGene-1) core definition. + * gcc/config/aarch64/aarch64.c

[PATCH 2/2, AArch64, v2] Pipeline model for APM XGene-1.

2014-11-19 Thread Philipp Tomsich
a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,5 +1,11 @@ 2014-11-19 Philipp Tomsich + * config/aarch64/aarch64.md: Include xgene1.md. + (generic_sched): Set to no for xgene1. + * config/arm/xgene1.md: New file. + +2014-11-19 Philipp Tomsich + * config/aarch64/aarch64-cores

[PATCH 0/2, AArch64, v3] APM X-Gene 1 cost-table and pipeline model

2014-11-21 Thread Philipp Tomsich
colleagues regarding the latencies and modelling of divides in the pipeline, we've readjusted the modelling of the divides another time... even though it doesn't make a difference in real-world benchmarks. Thanks to everyone who took the time to review and comment. Philipp Tomsich (2):

[PATCH 1/2] Core definition for APM XGene-1 and associated cost-table.

2014-11-21 Thread Philipp Tomsich
100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,11 @@ +2014-11-19 Philipp Tomsich + + * config/aarch64/aarch64-cores.def (xgene1): Update/add the + xgene1 (APM XGene-1) core definition. + * gcc/config/aarch64/aarch64.c: Add cost tables for APM XGene-1 + * config/arm

[PATCH 2/2] Pipeline model for APM XGene-1.

2014-11-21 Thread Philipp Tomsich
/ChangeLog index c9ac0d9..dad2278 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,5 +1,11 @@ 2014-11-19 Philipp Tomsich + * config/aarch64/aarch64.md: Include xgene1.md. + (generic_sched): Set to no for xgene1. + * config/arm/xgene1.md: New file. + +2014-11-19 Philipp Tomsich

Re: [PATCH v3] [aarch64] Add CPU support for Ampere Computing's eMAG.

2018-11-21 Thread Philipp Tomsich
This is currently slowed down by the speed of subversion (as my subversion tree was outdated). So it should only be a matter of days ... ;-) > On 21.11.2018, at 12:15, Christoph Müllner > wrote: > >> >> On 21.11.2018, at 11:26, Kyrill Tkachov wrote: >> >> Hi Christoph, >> >> On 20/11/18 18

Re: [PATCH v3] [aarch64] Correct the maximum shift amount for shifted operands.

2018-11-27 Thread Philipp Tomsich
Sam, > On 27.11.2018, at 14:06, Sam Tebbs wrote: > > > On 11/26/18 7:50 PM, Christoph Muellner wrote: >> The aarch64 ISA specification allows a left shift amount to be applied >> after extension in the range of 0 to 4 (encoded in the imm3 field). >> >> This is true for at least the following i

Re: [PATCH v3] [aarch64] Correct the maximum shift amount for shifted operands.

2018-11-28 Thread Philipp Tomsich
x0, x1, x0 >> 8: d65f03c0ret >> >> With the patch the ubfiz will be merged into the add instruction: >> >> : >> 0: 8b211000add x0, x0, w1, uxtb #4 >> 4: d65f03c0ret >> >> Tested with "
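
The merged instruction in the snippet above, `add x0, x0, w1, uxtb #4`, adds a zero-extended byte shifted left by 4 to a 64-bit value. A minimal C sketch of source code with that shape follows; the function name and signature are hypothetical, since the test case from the thread is truncated here, and whether a given compiler emits exactly this instruction is not guaranteed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical example: base + (zero-extended byte << shift).
   With shift == 4 this has the shape of "add x0, x0, w1, uxtb #4".
   The thread states the ISA allows a shift of 0..4 after extension. */
uint64_t add_uxtb_shifted(uint64_t base, uint8_t idx, unsigned shift)
{
    assert(shift <= 4);               /* range quoted in the thread */
    return base + ((uint64_t)idx << shift);
}
```

For example, `add_uxtb_shifted(0x1000, 0xff, 4)` evaluates to `0x1ff0`.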

[PATCH 1/4] Core definition for APM XGene-1 and associated cost-table.

2015-01-12 Thread Philipp Tomsich
..dd49d7f 100644 --- a/gcc/ChangeLog-2014 +++ b/gcc/ChangeLog-2014 @@ -5350,6 +5350,14 @@ optimization of ashiftrt of subreg of lshiftrt, check that code is ASHIFTRT. +2014-11-19 Philipp Tomsich + + * config/aarch64/aarch64-cores.def (xgene1): Update/add the + xgene1

[PATCH 3/4] Change the type of the prefetch-instructions to 'prefetch'.

2015-01-12 Thread Philipp Tomsich
--- gcc/config/aarch64/aarch64.md | 2 +- gcc/config/arm/types.md | 2 ++ 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 1f6b1b6..98f4f30 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aar

[PATCH 2/4] Pipeline model for APM XGene-1.

2015-01-12 Thread Philipp Tomsich
--- gcc/config/aarch64/aarch64.md | 1 + gcc/config/arm/xgene1.md | 531 ++ 2 files changed, 532 insertions(+) create mode 100644 gcc/config/arm/xgene1.md diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 12e1054..1f6b

[PATCH 0/4, AArch64, v4] APM X-Gene 1 cost-table and pipeline model

2015-01-12 Thread Philipp Tomsich
in LE and BE configurations) and for AArch32 (arm-none-eabi). I'd be grateful, if you could apply at least the AArch64 patches. Best, Phil. Philipp Tomsich (4): Core definition for APM XGene-1 and associated cost-table. Pipeline model for APM XGene-1. Change the type of the prefetch-i

[PATCH 4/4] Wire X-Gene 1 up in the ARM (32bit) backend as a AArch32-capable core.

2015-01-12 Thread Philipp Tomsich
@@ 63965. * config/rs6000/rs6000.c: Likewise. +2014-12-23 Philipp Tomsich + + * config/arm/arm.md (generic_sched): Specify xgene1 in 'no' list. + Include xgene1.md. + * config/arm/arm.c (arm_issue_rate): Specify 4 for xgene1. + * config/arm/arm

[AArch64 02/14] Add "xgene1" core identifier.

2014-02-18 Thread Philipp Tomsich
* aarch64/aarch64-cores.def: Add "xgene1". --- gcc/config/aarch64/aarch64-cores.def | 1 + gcc/config/aarch64/aarch64-tune.md | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def index 1039660..b4f6c16

[AArch64 00/14] Pipeline-independent changes for XGene-1

2014-02-18 Thread Philipp Tomsich
e-1. Given that the matching/structure of this cost-model is different from the existing implementation, we've chosen to keep this in a separate function for the time being. Philipp Tomsich (14): Use "generic" target, if no other default. Add "xgene1" core identifier.

[AArch64 01/14] Use "generic" target, if no other default.

2014-02-18 Thread Philipp Tomsich
The default target should be "generic", as Cortex-A53 includes optional ISA features (CRC and CRYPTO) that are not required for architectural compliance. The key difference between generic (which already uses the cortexa53 pipeline model for scheduling) is the absence of any optional ISA features i

[AArch64 03/14] Retrieve BRANCH_COST from tuning structure.

2014-02-18 Thread Philipp Tomsich
The BRANCH_COST affects whether conditional instructions (e.g. conditional moves) will be used in transforms in the middle-end. This change makes the branch_cost configurable from within the target tuning structure. --- gcc/config/aarch64/aarch64-protos.h | 2 ++ gcc/config/aarch64/aarch64.c

[AArch64 06/14] Extend '*tb1'.

2014-02-18 Thread Philipp Tomsich
The '*tb1' can safely be extended to match operands of any size, as long as the immediate operand (i.e. the bits tested) match the size of the register operand. This removes unnecessary zero-extension operations from the generated instruction stream. --- gcc/config/aarch64/aarch64.md | 4 ++-- 1

[AArch64 07/14] Define additional patterns for adds/subs.

2014-02-18 Thread Philipp Tomsich
--- gcc/config/aarch64/aarch64.md | 49 +++ 1 file changed, 49 insertions(+) diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 90f1ee9..13a75d3 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@

[AArch64 04/14] Correct the maximum shift amount for shifted operands.

2014-02-18 Thread Philipp Tomsich
--- gcc/config/aarch64/aarch64.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 43e4612..4327eb3 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -4409,7 +4409,7 @@ aarch64_output_

[AArch64 05/14] Add AArch64 'prefetch'-pattern.

2014-02-18 Thread Philipp Tomsich
--- gcc/config/aarch64/aarch64.md | 17 + gcc/config/arm/types.md | 2 ++ 2 files changed, 19 insertions(+) diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 99a6ac8..b972a1b 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/

[AArch64 10/14] Add movcc definition for GPF case.

2014-02-18 Thread Philipp Tomsich
--- gcc/config/aarch64/aarch64.md | 19 +++ 1 file changed, 19 insertions(+) diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index c72d123..b6453b6 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -2460,6 +2460,25 @@ }

[AArch64 11/14] Optimize and(s) patterns for HI/QI operands.

2014-02-18 Thread Philipp Tomsich
HImode and QImode operands can be handled in a more optimal way for logical AND than for logical OR operations. An AND will never set bits that are not already set in its operands, so the resulting mode/precision depends on the least precision of its operands with an implicit zero-extension to any

[AArch64 14/14] Add cost-model for XGene-1.

2014-02-18 Thread Philipp Tomsich
This completely rewritten cost-model provides a like-for-like benefit of approx. 3% on CoreMark. --- gcc/config/aarch64/aarch64.c | 885 ++- 1 file changed, 881 insertions(+), 4 deletions(-) diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aar

[AArch64 13/14] Initial tuning description for XGene-1 core.

2014-02-18 Thread Philipp Tomsich
The generic cost model for AArch64 can not be used to capture the microarchitectural cost of XGene-1 in full detail. For this reason, we use the basic tuning model of the Cortex-A53 for now. --- gcc/config/aarch64/aarch64-cores.def | 2 +- gcc/config/aarch64/aarch64.c | 28 ++

[AArch64 08/14] Define a variant of cmp for the CC_NZ case.

2014-02-18 Thread Philipp Tomsich
This pattern is not strictly necessary and a similar effect could be achieved through the use of a suitable compatibility relation for CC modes; in the meantime, this helps on some benchmarks. --- gcc/config/aarch64/aarch64.md | 13 + 1 file changed, 13 insertions(+) diff --git a/gcc/

[AArch64 09/14] Add special cases of zero-extend w/ compare operations.

2014-02-18 Thread Philipp Tomsich
--- gcc/config/aarch64/aarch64.md | 56 +++ 1 file changed, 56 insertions(+) diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 60e42af..c72d123 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@

[AArch64 12/14] Generate 'bics', when only interested in CC_NZ.

2014-02-18 Thread Philipp Tomsich
A specialized variant of '*and_one_cmpl3_compare0' is needed to match some cases (during the combine stage) that could be folded into a bics, when the output result is not used (i.e. when only the condition code is of interest). This is useful both for CoreMark and SPEC workloads. --- gcc/config/

Re: [PATCH v2] aarch64: Add support for Ampere-1B (-mcpu=ampere1b) CPU

2024-10-11 Thread Philipp Tomsich
We just noticed that we didn't request to backport this one… OK for backport? On Thu, 30 Nov 2023 at 00:55, Philipp Tomsich wrote: > Applied to master, thanks! > Philipp. > > On Tue, 28 Nov 2023 at 12:57, Richard Sandiford > wrote: > > > > Philipp Tomsich writ

Re: [PATCH v4] match: Fix A || B not optimized to true when !B implies A [PR114326]

2024-09-25 Thread Philipp Tomsich
> > > gcc/ChangeLog: > > > > * match.pd: Add two patterns to fold a ^ b to 0, when a == b. > > > > gcc/testsuite/ChangeLog: > > > > * gcc.dg/tree-ssa/fold-xor-and-or.c: New test. > > * gcc.dg/tree-ss

Re: [PATCH v2] match: Change (A * B) + (-C) to (B - C/A) * A, if C multiple of A [PR109393]

2024-09-25 Thread Philipp Tomsich
OK. > > Thanks, > Richard. > > > PR tree-optimization/109393 > > > > gcc/ChangeLog: > > > > * match.pd: (A * B) + (-C) -> (B - C/A) * A, if C a multiple of A. > > > > gcc/testsuite/ChangeLog: > > > > * gc
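
The rewrite named in the ChangeLog entry above, (A * B) + (-C) -> (B - C/A) * A when C is a multiple of A, can be checked on plain integers. The sketch below only illustrates the arithmetic identity; the function names are hypothetical and it does not reproduce the match.pd pattern or its overflow conditions.

```c
#include <assert.h>

/* Original form: (A * B) + (-C). */
long mul_then_sub(long a, long b, long c)
{
    return a * b + (-c);
}

/* Rewritten form: (B - C/A) * A.
   Only equivalent when C is an exact multiple of A. */
long sub_then_mul(long a, long b, long c)
{
    assert(a != 0 && c % a == 0);     /* precondition from the ChangeLog text */
    return (b - c / a) * a;
}
```

With a = 4, b = 10, c = 12 both forms give 28.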

Re: [PATCH v8] Target-independent store forwarding avoidance.

2024-11-24 Thread Philipp Tomsich
void-store-forwarding-1.c: New test. > > * gcc.target/aarch64/avoid-store-forwarding-2.c: New test. > > * gcc.target/aarch64/avoid-store-forwarding-3.c: New test. > > * gcc.target/aarch64/avoid-store-forwarding-4.c: New test. > > * gcc.target/aarch64/avoi

Re: [PATCH v8] Target-independent store forwarding avoidance.

2024-11-27 Thread Philipp Tomsich
On Thu 28. Nov 2024 at 15:36, Richard Biener wrote: > On Mon, Nov 25, 2024 at 3:28 AM Philipp Tomsich > wrote: > > > > Pushed to master with the following fixups: > > - new timevar added > > - nits addressed > > - whitespace fixes > > The pass

Re: [PATCH] avoid-store-forwarding: Reject changes when an instruction may throw [PR117816]

2024-12-06 Thread Philipp Tomsich
Applied to master. Thanks! --Philipp. On Fri, 6 Dec 2024 at 06:03, Jeff Law wrote: > > > > On 12/5/24 6:18 AM, Konstantinos Eleftheriou wrote: > > From: kelefth > > > > Avoid-store-forwarding doesn't handle the case where an instruction in the > > store-load sequence contains a REG_EH_REGION no

Re: [PATCH] avoid-store-forwarding: Fix base register initialization when eliminating loads [PR117835]

2024-12-30 Thread Philipp Tomsich
Applied to master (with fixed-up commit message). Thanks! --Philipp. On Sun, 29 Dec 2024 at 17:58, Jeff Law wrote: > > > > On 12/17/24 4:51 AM, Konstantinos Eleftheriou wrote: > > From: kelefth > > > > During the initialization of the base register for the zero-offset store, in > > the case th

Re: [PATCH] testsuite: Exclude test in pr109393.c from ilp32 targets [PR116845]

2025-02-04 Thread Philipp Tomsich
Applied to master with the requested change (to XFAIL for ilp32). Thanks, Philipp. On Tue, 4 Feb 2025 at 12:45, Richard Biener wrote: > On Tue, Feb 4, 2025 at 12:36 PM Konstantinos Eleftheriou > wrote: > > > > From: kelefth > > > > The match.pd canonicalization that this testcase checks for,

Re: [PATCH] asf: Enable pass at O2 or higher

2025-01-29 Thread Philipp Tomsich
+JiangNing Liu On Wed, 29 Jan 2025 at 10:38, Richard Biener wrote: > > On Wed, 29 Jan 2025, Christoph Müllner wrote: > > > The avoid-store-forwarding pass is disabled by default and therefore > > in the risk of bit-rotting. This patch addresses this by enabling > > the pass at O2 or higher. > >

Re: [PATCH] match: Change (A + CST0) * CST1 to (A + sign_extend(CST0)) * CST1 [PR116845]

2025-01-17 Thread Philipp Tomsich
Folks, we'd appreciate it if someone could take the time to review this fix for PR116845. Thanks, Philipp. On Tue, 31 Dec 2024 at 10:03, Konstantinos Eleftheriou wrote: > > From: kelefth > > `(A * B) + (-C) to (B - C/A) * A` fails to match on ILP32 targets due to > the upper bits of CST0 bei

Re: [PATCH] avoid-store-forwarding: Fix reg init on load-elimination [PR119160]

2025-04-05 Thread Philipp Tomsich
Jeff, On Sun, 30 Mar 2025 at 01:48, Jeff Law wrote: > > > > On 3/28/25 5:12 AM, Konstantinos Eleftheriou wrote: > > In the case that we are eliminating the load instruction, we use zero_extend > > for the initialization of the base register for the zero-offset store. > > This causes issues when

Re: [PATCH] avoid-store-forwarding: Fix reg init on load-elimination [PR119160]

2025-04-18 Thread Philipp Tomsich
Applied to trunk (16.0.0), thank you! Should this be backported to the GCC-15 release branch as well? --Philipp. On Mon, 31 Mar 2025 at 10:10, Philipp Tomsich wrote: > > Jeff, > > > On Sun, 30 Mar 2025 at 01:48, Jeff Law wrote: > > > > > > > > On 3/28/25

Re: [PING][PATCH] doc: Clarify REG_EH_REGION note usage

2025-04-18 Thread Philipp Tomsich
Applied to trunk, thank you! --Philipp. On Thu, 17 Apr 2025 at 21:51, Jeff Law wrote: > > > > On 4/8/25 6:12 AM, Konstantinos Eleftheriou wrote: > > Hi, > > Just a ping for https://gcc.gnu.org/pipermail/gcc-patches/2025- > > March/677635.html >

Re: [PATCH] avoid-store-forwarding: Fix reg init on load-elimination [PR119160]

2025-04-22 Thread Philipp Tomsich
cally, and assuming that there is sufficient reviewer bandwidth, we could land the "enabled by default" on trunk towards the end of next week. Thanks, Philipp. On Sat, 19 Apr 2025 at 14:41, Jeff Law wrote: > > > > On 4/18/25 4:37 PM, Sam James wrote: > > Philipp Toms

Re: [PATCH] testsuite: Skip pr119160 for RISC-V backend.

2025-05-08 Thread Philipp Tomsich
+Konstantinos Eleftheriou On Thu, 8 May 2025 at 10:30, Andreas Schwab wrote: > > On Mai 08 2025, Richard Biener wrote: > > > On Thu, May 8, 2025 at 10:02 AM Jiawei wrote: > >> > >> RISC-V backend don't support '-mgeneral-regs-only' option, skip it. > >> https://godbolt.org/z/38M8vPW74 > > > > T

Re: [PATCH v3 1/3] sbitmap: Add bitmap_bit_in_range_p_1 helper function

2025-05-19 Thread Philipp Tomsich
On Mon, 19 May 2025 at 16:10, Konstantinos Eleftheriou wrote: > > This patch adds the `bitmap_bit_in_range_p_1` helper function, > in order to be used by `bitmap_bit_in_range_p`. The helper function > contains the previous implementation of `bitmap_bit_in_range_p` and > `bitmap_bit_in_range_p` has

Re: [PATCH v3 3/3] asf: Fix calling of emit_move_insn on registers of different modes [PR119884]

2025-05-27 Thread Philipp Tomsich
Thanks everyone, I just applied the series (including the requested changes) to trunk. We'll also send the committed series (v4) to the list for archival. Philipp. On Tue, 20 May 2025 at 14:31, Richard Sandiford wrote: > > Konstantinos Eleftheriou writes: > > This patch uses `lowpart_subreg`

Re: [PATCH][AArch64] Support for LDP/STP of Q-registers

2018-06-05 Thread Dr. Philipp Tomsich
> On 5 Jun 2018, at 19:28, James Greenhalgh wrote: > > On Tue, Jun 05, 2018 at 11:32:06AM -0500, Kyrill Tkachov wrote: >> >> On 04/06/18 18:40, Kyrill Tkachov wrote: >>> Hi all, >>> >>> This patch adds support for generating LDPs and STPs of Q-registers. >>> This allows for more compact code g

Re: [PATCH 2/2, AArch64] Pipeline model for APM XGene-1.

2014-11-19 Thread Dr. Philipp Tomsich
t... > > On 19/11/14 17:32, Philipp Tomsich wrote: >> @@ -4211,3 +4211,5 @@ >> >> ;; Atomic Operations >> (include "atomics.md") >> + >> +(include "../arm/xgene1.md") > > Do you expect to add arm support for this core? If so, y

Re: [PATCH 2/2, AArch64, v2] Pipeline model for APM XGene-1.

2014-11-20 Thread Dr . Philipp Tomsich
Kyrill, > I don't mind it being in config/arm if you plan to wire it up later, good to > know. > Another comment inline…. I’ll clean up the missing xgene1_ and the mistyped xgene_ prefix and resubmit. >> +(define_insn_reservation "div" 2 >> + (and (eq_attr "tune" "xgene1") >> + (eq_attr

Re: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt) estimation in -ffast-math

2015-06-24 Thread Dr. Philipp Tomsich
Evandro, We’ve seen a 28% speed-up on gromacs in SPECfp for the (scalar) reciprocal sqrt. Also, the “reciprocal divide” patches are floating around in various of our git-tree, but aren’t ready for public consumption, yet… I’ll leave Benedikt to comment on potential timelines for getting that

Re: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt) estimation in -ffast-math

2015-06-24 Thread Dr. Philipp Tomsich
er [mailto:benedikt.hu...@theobroma-systems.com] >> Sent: Wednesday, June 24, 2015 12:11 >> To: Dr. Philipp Tomsich >> Cc: Evandro Menezes; gcc-patches@gcc.gnu.org >> Subject: Re: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt) >> estimation in -ffast-math >>

Re: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt) estimation in -ffast-math

2015-06-25 Thread Dr. Philipp Tomsich
Kumar, what is the relative gain that you see on Cortex-A57? Thanks, Philipp. > On 25 Jun 2015, at 17:35, Kumar, Venkataramanan > wrote: > > Changing to "1 step for float" and "2 steps for double" gives better gains > now for gromacs on cortex-a57. > > Regards, > Venkat. >> -Original M
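
The "1 step for float" and "2 steps for double" in the quoted message presumably refer to Newton-Raphson refinement steps applied to an initial reciprocal square root estimate. A minimal sketch of that refinement follows, assuming the caller supplies the initial estimate `x0`; the function names are hypothetical and this is not the code from the patch.

```c
#include <assert.h>

/* One Newton-Raphson step for x ~ 1/sqrt(d):
   x' = x * (3 - d * x * x) / 2. */
static double rsqrt_step(double d, double x)
{
    return x * (1.5 - 0.5 * d * x * x);
}

/* Refine an initial estimate x0 of 1/sqrt(d) with a given number of
   steps. Each step roughly squares the relative error, so fewer steps
   are needed for lower target precision. */
double rsqrt_refine(double d, double x0, int steps)
{
    assert(d > 0.0 && steps >= 0);
    double x = x0;
    for (int i = 0; i < steps; i++)
        x = rsqrt_step(d, x);
    return x;
}
```

Starting from the rough estimate 0.49 for d = 4 (exact answer 0.5), one step gives about 0.4997 and a second step gives about 0.4999997.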

Re: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt) estimation in -ffast-math

2015-06-29 Thread Dr. Philipp Tomsich
ikt , I have ICE for 444.namd with your patch, not sure if something > wrong in my local tree. > > Regards, > Venkat. > >> -Original Message- >> From: pins...@gmail.com [mailto:pins...@gmail.com] >> Sent: Sunday, June 28, 2015 8:35 PM >> To: Kumar, Venkata

Re: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt) estimation in -ffast-math

2015-06-29 Thread Dr. Philipp Tomsich
James, On 29 Jun 2015, at 13:36, James Greenhalgh wrote: > > On Mon, Jun 29, 2015 at 10:18:23AM +0100, Kumar, Venkataramanan wrote: >> >>> -Original Message- >>> From: Dr. Philipp Tomsich [mailto:philipp.toms...@theobroma-systems.com] >>> Sen

Re: Fix ARM bootstrap - xgene tune params

2015-01-15 Thread Dr. Philipp Tomsich
Richard, Thanks for catching this. Your change is optimal for X-Gene 1. —Phil. > On 15 Jan 2015, at 19:51, Richard Earnshaw wrote: > > The recent xgene tuning parameters merge broke the ARM bootstrap, since > the tables have been extended by an additional parameter giving: > > gcc/config/arm/

Re: [PATCH][wwwdocs] Mention xgene-1 in arm and aarch64, FreeBSD support for arm

2015-02-13 Thread Dr. Philipp Tomsich
> On 13 Feb 2015, at 11:14, Richard Earnshaw > wrote: > >> Is this ok? > > The repetitive nature of all these new cpus being added looks rather > wooden. I think it would be better to merge them into one change block, > that lists all the cpus and their internal names, then mentions once at >

Re: [PATCH][AArch64] Enable -frename-registers at -O2 and higher

2016-06-15 Thread Dr. Philipp Tomsich
> On 10 Jun 2016, at 01:28, Jim Wilson wrote: > > On Tue, May 31, 2016 at 2:56 AM, James Greenhalgh > wrote: >> As you're proposing to have this on by default, I'd like to give a chance >> to hear whether there is consensus as to this being the right choice for >> the thunderx, xgene1, exynos-m

Re: [PATCH 2/2][AArch64] Add bfx attribute

2016-11-10 Thread Dr. Philipp Tomsich
> On 10 Nov 2016, at 18:14, Wilco Dijkstra wrote: > > I think the XGene-1 scheduler might need a similar change as currently all > AArch64 > shifts are modelled as 2-cycle operations. Thanks for the heads-up. We’ll indeed need to update this. Regards, Philipp.

Re: [AArch64] Remove AARCH64_EXTRA_TUNE_RECIP_SQRT from Cortex-A57 tuning

2016-01-11 Thread Dr. Philipp Tomsich
James, ok from our side—good to see that this also benefits the A57. Best, Philipp. > On 11 Jan 2016, at 13:04, James Greenhalgh wrote: > > > Hi, > > I've seen a couple of large performance issues caused by expanding > the high-precision reciprocal square root for Cortex-A57, so I'd like > t

Re: [PATCH 2/2] Pipeline model for APM XGene-1.

2014-12-05 Thread Dr. Philipp Tomsich
Should I revise, or will you just drop the line when applying this? Thanks, Phil. > On 05 Dec 2014, at 18:23, Marcus Shawcroft wrote: > > On 21 November 2014 at 18:44, Philipp Tomsich > wrote: > >> +;; Machine description for AppliedMicro xgene1 core

Re: [PATCH, aarch64] Add prefetch support

2015-01-13 Thread Dr. Philipp Tomsich
Great. I should have an updated patch-set ready & tested later tonight. Best, Phil. > On 13 Jan 2015, at 15:18, Andrew Pinski wrote: > > On Tue, Jan 13, 2015 at 6:13 AM, Marcus Shawcroft > wrote: >> On 11 January 2015 at 02:37, Andrew Pinski wrote: >>> On Tue, Nov 11, 2014 at 6:47 AM, Marcus S

Re: [AArch64 05/14] Add AArch64 'prefetch'-pattern.

2014-05-28 Thread Dr. Philipp Tomsich
On 28 May 2014, at 16:25 , Gopalasubramanian, Ganesh wrote: > Hi Philipp, > >> These changes look good to me. >> We'll try them out on the benchmarks that caused us to add prefetching in >> the first place. > > If you are OK, I would like to get these changes upstreamed. Sorry for the delay

Re: [AArch64 05/14] Add AArch64 'prefetch'-pattern.

2014-02-28 Thread Dr. Philipp Tomsich
Ganesh, On 28 Feb 2014, at 10:13 , Gopalasubramanian, Ganesh wrote: > I also have attached a patch that implements the following. > * Prefetch with immediate offset in the range 0 to 32760 (multiple of 8). > Added a predicate for this. > * Prefetch with immediate offset - in the range

Re: [PATCH] match.pd: undistribute (a << s) & C, when C = (M << s) and exact_log2(M - 1)

2020-11-11 Thread Philipp Tomsich via Gcc-patches
Jakub, On Wed, 11 Nov 2020 at 11:31, Jakub Jelinek wrote: > > On Wed, Nov 11, 2020 at 11:17:32AM +0100, Philipp Tomsich wrote: > > From: Philipp Tomsich > > > > The function > > long f(long a) > > { > > return(a & 0xull) <&

Re: [PATCH v1 1/2] Simplify shifts wider than the bitwidth of types

2020-11-17 Thread Philipp Tomsich via Gcc-patches
Jakub, On Tue, 17 Nov 2020 at 16:56, Jeff Law wrote: > > > > On 11/17/20 4:53 AM, Philipp Tomsich wrote: > > Jeff, > > > > On Tue, 17 Nov 2020 at 00:38, Jeff Law > <mailto:l...@redhat.com>> wrote: > > > > > > On 11/16/20 11:5

Re: [PATCH v1 1/2] Simplify shifts wider than the bitwidth of types

2020-11-17 Thread Philipp Tomsich via Gcc-patches
Jeff, On Tue, 17 Nov 2020 at 16:56, Jeff Law wrote: > > Note that in his comment to patch 2/2, Jim has noted that user code for > > RISC-V may assume a truncation of the shift-operand... > What I'd suggest doing would be to leave the invalid shift count in the > IL in VRP, then extend the erroneo
