Re: [PATCH] xtensa: Remove TARGET_PROMOTE_FUNCTION_MODE

2025-06-30 Thread Takayuki 'January June' Suwa
On 2025/07/01 7:40, H.J. Lu wrote: On Mon, Jun 23, 2025 at 5:41 AM Takayuki 'January June' Suwa wrote: On 2025/06/23 6:20, H.J. Lu wrote: On Sun, Jun 22, 2025 at 9:54 PM Max Filippov wrote: On Sun, Jun 22, 2025 at 5:49 AM Takayuki 'January June' Suwa wrote: On

Re: [PATCH] xtensa: Remove TARGET_PROMOTE_FUNCTION_MODE

2025-06-22 Thread Takayuki 'January June' Suwa
On 2025/06/23 6:20, H.J. Lu wrote: On Sun, Jun 22, 2025 at 9:54 PM Max Filippov wrote: On Sun, Jun 22, 2025 at 5:49 AM Takayuki 'January June' Suwa wrote: On 2025/06/22 6:41, Max Filippov wrote: On Sat, Jun 21, 2025 at 2:12 PM Takayuki 'January June' Suwa wrote:

Re: [PATCH] xtensa: Remove TARGET_PROMOTE_FUNCTION_MODE

2025-06-22 Thread Takayuki 'January June' Suwa
On 2025/06/22 6:41, Max Filippov wrote: On Sat, Jun 21, 2025 at 2:12 PM Takayuki 'January June' Suwa wrote: That hook has since been deprecated (commit a670ebde3995481225ec62b29686ec07a21e5c10) and has led to incorrect results on Xtensa: /* example */ #define

Re: [PATCH] xtensa: Implement l(ceil|floor|round|)sfsi2 insn patterns and their scaled variants

2025-06-05 Thread Takayuki 'January June' Suwa
On 2025/06/06 8:55, Max Filippov wrote: On Thu, Jun 05, 2025 at 09:19:19PM +0900, Takayuki 'January June' Suwa wrote: On 2025/06/05 5:09, Max Filippov wrote: On Tue, Jun 3, 2025 at 7:44 AM Takayuki 'January June' Suwa wrote: By using the previously unused CEIL|FLOOR|RO

Re: [PATCH] xtensa: Implement l(ceil|floor|round|)sfsi2 insn patterns and their scaled variants

2025-06-05 Thread Takayuki 'January June' Suwa
On 2025/06/05 5:09, Max Filippov wrote: Hi Suwa-san, Hi Max, (thanks for every regtesting) On Tue, Jun 3, 2025 at 7:44 AM Takayuki 'January June' Suwa wrote: By using the previously unused CEIL|FLOOR|ROUND.S floating-point coprocessor instructions. In addition, two instructi

[PATCH 1/2] xtensa: Fix suboptimal loading of pooled constant value into hardware single-precision FP register

2024-07-23 Thread Takayuki 'January June' Suwa
d/LRA.From a552e4fca21ff9a0c7a5327dd15ccdada36930c1 Mon Sep 17 00:00:00 2001 From: Takayuki 'January June' Suwa Date: Tue, 23 Jul 2024 16:03:12 +0900 Subject: [PATCH 1/2] xtensa: Fix suboptimal loading of pooled constant value into hardware single-precision FP register We would like to implement the

[PATCH 2/2] xtensa: Add missing speed cost for TYPE_FARITH in TARGET_INSN_COST

2024-07-23 Thread Takayuki 'January June' Suwa
'January June' Suwa Date: Wed, 24 Jul 2024 06:07:06 +0900 Subject: [PATCH 2/2] xtensa: Add missing speed cost for TYPE_FARITH in TARGET_INSN_COST According to the implemented pipeline model, this cost can be assumed to be 1 clock cycle. gcc/ChangeLog: * config/xtensa

Re: [PATCH 6/6] Add a late-combine pass [PR106594]

2024-06-22 Thread Takayuki 'January June' Suwa
Hi! On 2024/06/23 1:49, Richard Sandiford wrote: Takayuki 'January June' Suwa writes: On 2024/06/20 22:34, Richard Sandiford wrote: This patch adds a combine pass that runs late in the pipeline. There are two instances: one between combine and split1, and one after postreload.

[PATCH 1/2] xtensa: Resurrect LEAF_REGISTERS and LEAF_REG_REMAP

2024-03-26 Thread Takayuki 'January June' Suwa
They were once mistakenly removed with "xtensa: Remove old broken tweak for leaf function", but caused unwanted register spills. gcc/ChangeLog: * config/xtensa/xtensa.h (LEAF_REGISTERS, LEAF_REG_REMAP): Withdraw the removal. (REG_ALLOC_ORDER): Cosmetics. * config/x

[PATCH 2/2] xtensa: Make use of std::swap where appropriate

2024-03-26 Thread Takayuki 'January June' Suwa
No functional changes. gcc/ChangeLog: * config/xtensa/xtensa.cc (gen_int_relational, gen_float_relational): Replace tempvar-based value-swapping codes with std::swap. * config/xtensa/xtensa.md (movdi_internal, movdf_internal): Ditto. --- gcc/config/xtensa/

[PATCH] xtensa: Add supplementary split pattern for "*addsubx"

2024-03-21 Thread Takayuki 'January June' Suwa
int test(int a) { return a * 4 + 3; } In the example above, since Xtensa has instructions to add register value scaled by 2, 4 or 8 (and corresponding define_insns), we would expect them to be used but not, because it is transformed before reaching the RTL generation pass as below: int tes

[PATCH 1/2 v2] xtensa: Recover constant synthesis for HImode after LRA transition

2024-02-04 Thread Takayuki 'January June' Suwa
After LRA transition, HImode constants that don't fit into signed 12 bits are no longer subject to constant synthesis: /* example */ void test(void) { short foo = 32767; __asm__ ("" :: "r"(foo)); } ;; before .literal_position .literal .LC0, 32767 te

[PATCH 1/2] xtensa: Recover constant synthesis for HImode after LRA transition

2024-02-03 Thread Takayuki 'January June' Suwa
After LRA transition, HImode constants that don't fit into signed 12 bits are no longer subject to constant synthesis: /* example */ void test(void) { short foo = 32767; __asm__ ("" :: "r"(foo)); } ;; before .literal_position .literal .LC0, 32767 te

[PATCH 2/2] xtensa: Fix missing mode warning in "*eqne_zero_masked_bits"

2024-02-03 Thread Takayuki 'January June' Suwa
gcc/ChangeLog: * config/xtensa/xtensa.md (*eqne_zero_masked_bits): Add missing ":SI" to the match_operator. --- gcc/config/xtensa/xtensa.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/config/xtensa/xtensa.md b/gcc/config/xtensa/xtensa.md index 5242eb3c0

Re: [RFC] gcc: xtensa: use salt/saltu in xtensa_expand_scc

2023-09-08 Thread Takayuki 'January June' Suwa via Gcc-patches
Hi! On 2023/09/07 23:22, Max Filippov wrote: > gcc/ > * config/xtensa/predicates.md (xtensa_cstoresi_operator): Add > unsigned comparisons. > * config/xtensa/xtensa.cc (xtensa_expand_scc): Add code > generation of salt/saltu instructions. > * config/xtensa/xtensa.h (T

[PATCH] xtensa: Optimize several boolean evaluations of EQ/NE against constant zero

2023-09-08 Thread Takayuki 'January June' Suwa via Gcc-patches
An idiomatic implementation of boolean evaluation of whether a register is zero or not in Xtensa is to assign 0 and 1 to the temporary and destination, and then issue the MOV[EQ/NE]Z machine instruction (See 8.3.2 Instruction Idioms, Xtensa ISA refman., p.599): ;; A2 = (A3 != 0) ? 1 : 0; m

Re: [PATCH] xtensa: Optimize boolean evaluation when SImode EQ/NE to zero if TARGET_MINMAX

2023-09-05 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2023/09/06 8:01, Max Filippov wrote: > Hi Suwa-san, Hi! > > On Tue, Sep 5, 2023 at 2:29 AM Takayuki 'January June' Suwa > wrote: >> >> This patch optimizes the boolean evaluation for equality to 0 in SImode >> using the MINU (Minimum Value Unsigne

[PATCH] xtensa: Optimize boolean evaluation when SImode EQ/NE to zero if TARGET_MINMAX

2023-09-05 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch optimizes the boolean evaluation for equality to 0 in SImode using the MINU (Minimum Value Unsigned) machine instruction available when TARGET_MINMAX is configured, for example, (x != 0) to MINU(x, 1) and (x == 0) to (MINU(x, 1) ^ 1). /* example */ int test0(int x) { retur
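The two identities named in this description can be checked in plain C. This is only a sketch, not the patch's code: `minu`, `ne_zero`, `eq_zero` and `check_minu_identities` are hypothetical names, and `minu` merely models an unsigned-minimum instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C model of the MINU instruction: unsigned minimum. */
static uint32_t minu(uint32_t a, uint32_t b)
{
  return a < b ? a : b;
}

/* (x != 0) rewritten as MINU(x, 1). */
static int ne_zero(int32_t x)
{
  return (int)minu((uint32_t)x, 1u);
}

/* (x == 0) rewritten as MINU(x, 1) ^ 1. */
static int eq_zero(int32_t x)
{
  return (int)(minu((uint32_t)x, 1u) ^ 1u);
}

/* Compare both rewrites against the ordinary C comparisons. */
static void check_minu_identities(void)
{
  const int32_t samples[] = { 0, 1, 2, -1, INT32_MAX, INT32_MIN };
  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    {
      assert(ne_zero(samples[i]) == (samples[i] != 0));
      assert(eq_zero(samples[i]) == (samples[i] == 0));
    }
}
```

Treating the operand as unsigned is what makes negative inputs come out as 1, since any non-zero bit pattern is at least 1 when read as unsigned.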

[PATCH] xtensa: Use HARD_REG_SET instead of bare integer

2023-07-03 Thread Takayuki 'January June' Suwa via Gcc-patches
gcc/ChangeLog: * config/xtensa/xtensa.cc (machine_function, xtensa_expand_prologue): Change to use HARD_REG_BIT and its macros. * config/xtensa/xtensa.md (peephole2: regmove elimination during DFmode input reload): Likewise. --- gcc/config/xtensa/xtensa.cc

[PATCH 1/2] xtensa: Fix missing mode warning in "*eqne_INT_MIN"

2023-07-01 Thread Takayuki 'January June' Suwa via Gcc-patches
gcc/ChangeLog: * config/xtensa/xtensa.md (*eqne_INT_MIN): Add missing ":SI" to the match_operator. --- gcc/config/xtensa/xtensa.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/config/xtensa/xtensa.md b/gcc/config/xtensa/xtensa.md index 4b4ab3f5f37..b1af0

[PATCH 2/2] xtensa: The use of CLAMPS instruction also requires TARGET_MINMAX, as well as TARGET_CLAMPS

2023-07-01 Thread Takayuki 'January June' Suwa via Gcc-patches
Because both smin and smax requiring TARGET_MINMAX are essential to the RTL representation. gcc/ChangeLog: * config/xtensa/xtensa.cc (xtensa_match_CLAMPS_imms_p): Simplify. * config/xtensa/xtensa.md (*xtensa_clamps): Add TARGET_MINMAX to the condition. --- gcc/con

[PATCH 1/2] xtensa: Remove TARGET_MEMORY_MOVE_COST hook

2023-06-18 Thread Takayuki 'January June' Suwa via Gcc-patches
It used to always return a constant 4, which is same as the default behavior, but doesn't take into account the effects of secondary reloads. Therefore, the implementation of this target hook is removed. gcc/ChangeLog: * config/xtensa/xtensa.cc (TARGET_MEMORY_MOVE_COST, xtensa_me

[PATCH 2/2] xtensa: constantsynth: Add new 2-insns synthesis pattern

2023-06-18 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch adds a new 2-instructions constant synthesis pattern: - A non-negative square value that root can fit into a signed 12-bit: => "MOVI(.N) Ax, simm12" + "MULL Ax, Ax, Ax" Due to the execution cost of the integer multiply instruction (MULL), this synthesis works only when the 32-bit
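As a rough illustration of the matching condition only (a square whose root fits in a signed 12-bit immediate), the test could look like the sketch below. The function name is hypothetical, and the cost condition that the truncated description goes on to mention is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 and stores a root in *root when v == root * root for some
   root in the signed 12-bit range [-2048, 2047]; returns 0 otherwise.
   Hypothetical helper, not taken from the patch. */
static int simm12_square_root(uint32_t v, int *root)
{
  for (int r = 0; r <= 2047; r++)
    if ((uint32_t)(r * r) == v)
      {
        *root = r;
        return 1;
      }
  /* -2048 is the only signed 12-bit value whose square is not
     already covered by a non-negative root. */
  if (v == 2048u * 2048u)
    {
      *root = -2048;
      return 1;
    }
  return 0;
}

static void check_simm12_square_root(void)
{
  int r = 12345;
  assert(simm12_square_root(1000000u, &r) && r == 1000);
  assert(simm12_square_root(4194304u, &r) && r == -2048);
  assert(!simm12_square_root(3u, &r));
}
```

When the search succeeds, the constant would correspond to loading `root` with a move-immediate and squaring it, which is the two-instruction shape the description names.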

Re: [PATCH v2] xtensa: Optimize boolean evaluation or branching when EQ/NE to zero in S[IF]mode

2023-06-05 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2023/06/06 0:15, Max Filippov wrote: > Hi Suwa-san, Hi! Thanks for your regtest every time. > > On Mon, Jun 5, 2023 at 2:37 AM Takayuki 'January June' Suwa > wrote: >> >> This patch optimizes the boolean evaluation of EQ/NE against zero >> by addin

[PATCH v2] xtensa: Optimize boolean evaluation or branching when EQ/NE to zero in S[IF]mode

2023-06-05 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch optimizes the boolean evaluation of EQ/NE against zero by adding two insn_and_split patterns similar to SImode conditional store: "eq_zero": op0 = (op1 == 0) ? 1 : 0; op0 = clz(op1) >> 5; /* optimized (requires TARGET_NSA) */ "movsicc_ne0_reg_0": op0 = (op1 !=
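The "eq_zero" rewrite relies on a count-leading-zeros result of 32 for a zero input, so that shifting right by 5 leaves 1 only in that case. A minimal C sketch of that arithmetic follows; `clz32` and the other names are hypothetical, and the portable loop is used because `__builtin_clz(0)` is undefined in GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Portable count-leading-zeros that returns 32 for an input of 0.
   This return value for 0 is the assumption the identity depends on. */
static unsigned clz32(uint32_t x)
{
  unsigned n = 0;
  if (x == 0)
    return 32;
  while (!(x & 0x80000000u))
    {
      x <<= 1;
      n++;
    }
  return n;
}

/* (x == 0) ? 1 : 0 rewritten as clz(x) >> 5. */
static int eq_zero_clz(uint32_t x)
{
  return (int)(clz32(x) >> 5);
}

static void check_eq_zero_clz(void)
{
  assert(eq_zero_clz(0u) == 1);
  assert(eq_zero_clz(1u) == 0);           /* clz = 31, 31 >> 5 = 0 */
  assert(eq_zero_clz(0x80000000u) == 0);  /* clz = 0 */
}
```

Every non-zero input has a count between 0 and 31, so bit 5 of the count is set only for zero.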

[PATCH] xtensa: Optimize boolean evaluation or branching when EQ/NE to INT_MIN

2023-06-03 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch optimizes both the boolean evaluation of and the branching of EQ/NE against INT_MIN (-2147483648), by taking advantage of the specification that the ABS machine instruction on Xtensa returns INT_MIN iff its operand is INT_MIN, and a non-negative value otherwise. /* example */ int test0(int x) { r
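One way to see why the ABS property helps: if ABS yields a negative result only for INT_MIN, then a sign test on the ABS result is equivalent to comparing against INT_MIN. The sketch below models that in C under this assumption; `abs_model` and `eq_int_min` are hypothetical names, and the instruction sequence the patch actually emits is not shown in the truncated snippet.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C model of the ABS behaviour described above:
   INT32_MIN maps to itself, everything else to its magnitude. */
static int32_t abs_model(int32_t x)
{
  if (x == INT32_MIN)
    return INT32_MIN;  /* special-cased to avoid signed overflow in C */
  return x < 0 ? -x : x;
}

/* (x == INT_MIN) expressed as a sign test on the ABS result. */
static int eq_int_min(int32_t x)
{
  return abs_model(x) < 0;
}

static void check_eq_int_min(void)
{
  const int32_t samples[] = { INT32_MIN, INT32_MIN + 1, -1, 0, 1, INT32_MAX };
  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    assert(eq_int_min(samples[i]) == (samples[i] == INT32_MIN));
}
```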

[PATCH] xtensa: Optimize boolean evaluation or branching when EQ/NE to zero in S[IF]mode

2023-06-03 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch optimizes the boolean evaluation of EQ/NE against zero by adding two insn_and_split patterns similar to SImode conditional store: "eq_zero": op0 = (op1 == 0) ? 1 : 0; op0 = clz(op1) >> 5; /* optimized (requires TARGET_NSA) */ "movsicc_ne0_reg_0": op0 = (op1 !=

Re: [PATCH 2/3 v3] xtensa: Add 'adddi3' and 'subdi3' insn patterns

2023-06-01 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2023/06/01 23:20, Max Filippov wrote: > On Wed, May 31, 2023 at 11:01 PM Takayuki 'January June' Suwa > wrote: >> More optimized than the default RTL generation. >> >> gcc/ChangeLog: >> >> * config/xtensa/xtensa.md (adddi3, subdi3): >

[PATCH 2/3 v3] xtensa: Add 'adddi3' and 'subdi3' insn patterns

2023-05-31 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2023/05/31 15:02, Max Filippov wrote: Hi! > On Tue, May 30, 2023 at 2:50 AM Takayuki 'January June' Suwa > wrote: >> >> Resubmitting the correct one due to a mistake in merging order of fixes. >> --- >> More optimized than the def

[PATCH 3/3 v2] xtensa: Optimize 'cstoresi4' insn pattern

2023-05-30 Thread Takayuki 'January June' Suwa via Gcc-patches
Resubmitting the correct one due to a mistake in merging order of fixes. --- This patch introduces more optimized implementations for the 6 cstoresi4 insn comparison methods (eq/ne/lt/le/gt/ge, however, required TARGET_NSA for eq). gcc/ChangeLog: * config/xtensa/xtensa.cc (xtensa_expand_s

[PATCH 2/3 v2] xtensa: Add 'adddi3' and 'subdi3' insn patterns

2023-05-30 Thread Takayuki 'January June' Suwa via Gcc-patches
Resubmitting the correct one due to a mistake in merging order of fixes. --- More optimized than the default RTL generation. gcc/ChangeLog: * config/xtensa/xtensa.md (adddi3, subdi3): New RTL generation patterns implemented according to the instruction idioms described i

[PATCH 3/3] xtensa: Optimize 'cstoresi4' insn pattern

2023-05-30 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch introduces more optimized implementations for the 6 cstoresi4 insn comparison methods (eq/ne/lt/le/gt/ge, however, required TARGET_NSA for eq). gcc/ChangeLog: * config/xtensa/xtensa.cc (xtensa_expand_scc): Add dedicated optimization code for cstoresi4 (eq/ne/gt/ge/lt/le

[PATCH 2/3] xtensa: Add 'adddi3' and 'subdi3' insn patterns

2023-05-30 Thread Takayuki 'January June' Suwa via Gcc-patches
More optimized than the default RTL generation. gcc/ChangeLog: * config/xtensa/xtensa.md (adddi3, subdi3): New RTL generation patterns implemented according to the instruction idioms described in the Xtensa ISA reference manual (p. 600). --- gcc/config/xtensa/xtensa.md

[PATCH 1/3] xtensa: Improve "*shlrd_reg" insn pattern and its variant

2023-05-30 Thread Takayuki 'January June' Suwa via Gcc-patches
The insn "*shlrd_reg" shifts two registers with a funnel shifter by the third register to get a single word result: reg0 = (reg1 SHIFT_OP0 reg3) BIT_JOIN_OP (reg2 SHIFT_OP1 (32 - reg3)) where the funnel left shift is SHIFT_OP0 := ASHIFT, SHIFT_OP1 := LSHIFTRT and its right shift is SHIFT_OP0 :=

[PATCH 3/3] xtensa: Rework 'setmemsi' insn pattern

2023-05-25 Thread Takayuki 'January June' Suwa via Gcc-patches
In order to reject voodoo estimation logic with lots of magic numbers, this patch revises the code to measure the costs of the three memset methods based on the actual emission size of the insn sequence corresponding to each method and choose the smallest one. gcc/ChangeLog: * config/xten

[PATCH 1/3] xtensa: Addendum of the commit e33d2dcb463161a110ac345a451132ce8b2b23d9

2023-05-25 Thread Takayuki 'January June' Suwa via Gcc-patches
gcc/ChangeLog: * config/xtensa/xtensa.md (*extzvsi-1bit_ashlsi3): Retract excessive line folding, and correct the value of the "length" insn attribute related to TARGET_DENSITY. (*extzvsi-1bit_addsubx): Ditto. --- gcc/config/xtensa/xtensa.md | 11 ++- 1 fil

[PATCH 2/3] xtensa: Add 'subtraction from constant' insn pattern

2023-05-25 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch tries to eliminate the use of a temporary pseudo for '(minus:SI (const_int) (reg:SI))' if the addition of negative constant value can be emitted in a single machine instruction. /* example */ int test0(int x) { return 1 - x; } int test1(int x) { return 100 - x;

[PATCH v2] xtensa: Optimize '(x & CST1_POW2) != 0 ? CST2_POW2 : 0'

2023-05-22 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2023/05/23 11:27, Max Filippov wrote: > Hi Suwa-san, Hi! > This change introduces a bunch of test failures on big endian configuration. > I believe that's because the starting bit position for zero_extract is counted > from different ends depending on the endianness. Oops, what a stupid mista

[PATCH 1/2] xtensa: Optimize '(x & CST1_POW2) != 0 ? CST2_POW2 : 0'

2023-05-22 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch decreases one machine instruction from "single bit extraction with shifting" operation, and tries to eliminate the conditional branch if CST2_POW2 doesn't fit into signed 12 bits with the help of ifcvt optimization. /* example #1 */ int test0(int x) { return (x & 1048576) !

[PATCH 2/2] xtensa: Merge '*addx' and '*subx' insn patterns into one

2023-05-22 Thread Takayuki 'January June' Suwa via Gcc-patches
By making use of the 'addsub_operator' added in the last patch. gcc/ChangeLog: * config/xtensa/xtensa.md (*addsubx): Rename from '*addx', and change to also accept '*subx' pattern. (*subx): Remove. --- gcc/config/xtensa/xtensa.md | 31 +-- 1 fi

[PATCH v2] xtensa: Make full transition to LRA

2023-05-08 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2023/05/08 22:43, Richard Biener wrote: [snip] >> -mlra > > If they were in any released compiler options should be kept > (doing nothing) for backward compatibility. Use for example > > mlra > Target WarnRemoved > Removed in GCC 14. This switch has no effect. > > or > > mlra > Target Igno

[PATCH] xtensa: Make full transition to LRA

2023-05-08 Thread Takayuki 'January June' Suwa via Gcc-patches
gcc/ChangeLog: * config/xtensa/constraints.md (R, T, U): Change define_constraint to define_memory_constraint. * config/xtensa/xtensa.cc (xtensa_lra_p, TARGET_LRA_P): Remove. (xtensa_emit_move_sequence): Remove "if (reload_in_progress)" clause as it

[PATCH] xtensa: Remove REG_OK_STRICT and its derivatives

2023-03-12 Thread Takayuki 'January June' Suwa via Gcc-patches
Because GO_IF_LEGITIMATE_ADDRESS was deprecated a long time ago (see commit c6c3dba931548987c78719180e30ebc863404b89). gcc/ChangeLog: * config/xtensa/xtensa.h (REG_OK_STRICT, REG_OK_FOR_INDEX_P, REG_OK_FOR_BASE_P): Remove. --- gcc/config/xtensa/xtensa.h | 11 +-- 1 file c

[PATCH] xtensa: Fix for enabling LRA

2023-03-07 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch makes LRA work well, with some exceptions (e.g. MI thunk generation due to pretending reload_completed). gcc/ChangeLog: * config/xtensa/constraints.md (R, T, U): Change define_constraint to define_memory_constraint. * config/xtensa/xtensa.cc (xtensa_legitimate_constan

[PATCH] xtensa: Make use of CLAMPS instruction if configured

2023-02-26 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch introduces the use of CLAMPS instruction when the instruction is configured. /* example */ int test(int a) { if (a < -512) return -512; if (a > 511) return 511; return a; } ;; prereq: TARGET_CLAMPS test: clamps a2, a2, 9
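The example in this description clamps to [-512, 511] and is shown compiling to `clamps a2, a2, 9`, which suggests a range of [-2^n, 2^n - 1] for an immediate n. The sketch below models that reading in C; `clamps_model` is a hypothetical name, and the set of immediates the real instruction accepts is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C model: clamp x to the signed range [-2^n, 2^n - 1].
   n is assumed to be small enough that 1 << n does not overflow. */
static int32_t clamps_model(int32_t x, unsigned n)
{
  int32_t lo = -((int32_t)1 << n);
  int32_t hi = ((int32_t)1 << n) - 1;
  return x < lo ? lo : (x > hi ? hi : x);
}

/* Mirrors the test() function quoted in the description (n = 9). */
static void check_clamps_model(void)
{
  assert(clamps_model(-1000, 9) == -512);
  assert(clamps_model(1000, 9) == 511);
  assert(clamps_model(42, 9) == 42);
}
```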

Re: [PATCH] gcc: xtensa: fix PR target/108919

2023-02-25 Thread Takayuki 'January June' Suwa via Gcc-patches
Hello, Max: On 2023/02/25 19:01, Max Filippov wrote: > gcc/ > PR target/108919 > > * config/xtensa/xtensa-protos.h > (xtensa_prepare_expand_call): Rename to xtensa_expand_call. > * config/xtensa/xtensa.cc (xtensa_prepare_expand_call): Rename > to xtensa_expa

[PATCH 2/2] xtensa: Fix missing mode warnings in machine description

2023-02-22 Thread Takayuki 'January June' Suwa via Gcc-patches
gcc/ChangeLog: * config/xtensa/xtensa.md (zero_cost_loop_start, zero_cost_loop_end, loop_end): Add missing "SI:" to PLUS RTXes. --- gcc/config/xtensa/xtensa.md | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/gcc/config/xtensa/xtensa.md b/gc

[PATCH 1/2] xtensa: Fix non-fatal regression introduced by b2ef02e8cbbaf95fee98be255f697f47193960ec

2023-02-22 Thread Takayuki 'January June' Suwa via Gcc-patches
In commit b2ef02e8cbbaf95fee98be255f697f47193960ec, the sibling call insn included (use (reg:SI A0_REG)) to fix the problem, which added a USE chain unconditionally to the data flow of register A0 during the sibling call. As a result, df_regs_ever_live_p (A0_REG) returns true, so even if register

[PATCH] xtensa: Enforce return address saving when -Og is specified

2023-02-17 Thread Takayuki 'January June' Suwa via Gcc-patches
A leaf function often omits saving its return address to the stack slot, and this often makes debugging very confusing, especially for stack dump analysis. gcc/ChangeLog: * config/xtensa/xtensa.cc (xtensa_call_save_reg): Change to return true if register A0 (return address r

[PATCH v5] xtensa: Eliminate unnecessary general-purpose reg-reg moves

2023-02-17 Thread Takayuki 'January June' Suwa via Gcc-patches
Register-register move instructions that can be easily seen as unnecessary by the human eye may remain in the compiled result. For example: /* example */ double test(double a, double b) { return __builtin_copysign(a, b); } test: add.n a3, a3, a3 extui a5, a5, 31, 1 s

[PATCH v7] xtensa: Eliminate the use of callee-saved register that saves and restores only once

2023-02-16 Thread Takayuki 'January June' Suwa via Gcc-patches
In the case of the CALL0 ABI, values that must be retained before and after function calls are placed in the callee-saved registers (A12 through A15) and referenced later. However, it is often the case that the save and the reference are each only once and a simple register-register move (with tw

Re: [PATCH v6] xtensa: Eliminate the use of callee-saved register that saves and restores only once

2023-02-16 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2023/02/16 7:18, Max Filippov wrote: > Hi Suwa-san, Hi! > > On Thu, Jan 26, 2023 at 7:17 PM Takayuki 'January June' Suwa > wrote: >> >> In the case of the CALL0 ABI, values that must be retained before and >> after function calls are placed in the ca

[PATCH v6] xtensa: Eliminate the use of callee-saved register that saves and restores only once

2023-01-26 Thread Takayuki 'January June' Suwa via Gcc-patches
In the case of the CALL0 ABI, values that must be retained before and after function calls are placed in the callee-saved registers (A12 through A15) and referenced later. However, it is often the case that the save and the reference are each only once and a simple register-register move (with tw

[PATCH v4] xtensa: Eliminate unnecessary general-purpose reg-reg moves

2023-01-23 Thread Takayuki 'January June' Suwa via Gcc-patches
Register-register move instructions that can be easily seen as unnecessary by the human eye may remain in the compiled result. For example: /* example */ double test(double a, double b) { return __builtin_copysign(a, b); } test: add.n a3, a3, a3 extui a5, a5, 31, 1 s

[PATCH v5] xtensa: Eliminate the use of callee-saved register that saves and restores only once

2023-01-23 Thread Takayuki 'January June' Suwa via Gcc-patches
In the case of the CALL0 ABI, values that must be retained before and after function calls are placed in the callee-saved registers (A12 through A15) and referenced later. However, it is often the case that the save and the reference are each only once and a simple register-register move (with tw

Re: [PATCH v4] xtensa: Eliminate the use of callee-saved register that saves and restores only once

2023-01-22 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2023/01/23 0:45, Max Filippov wrote: > On Fri, Jan 20, 2023 at 8:39 PM Takayuki 'January June' Suwa > wrote: >> On 2023/01/21 0:14, Max Filippov wrote: >>> After having this many attempts and getting to the issues that are >>> really hard to detect I w

Re: [PATCH v4] xtensa: Eliminate the use of callee-saved register that saves and restores only once

2023-01-20 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2023/01/21 0:14, Max Filippov wrote: > Hi Suwa-san, Hi! > > On Wed, Jan 18, 2023 at 7:50 PM Takayuki 'January June' Suwa > wrote: >> >> In the previous patch, if insn is JUMP_INSN or CALL_INSN, it bypasses the >> reg check (possibly FAIL). >> &g

[PATCH] xtensa: Revise 89afb2e86fcb29c559b2957fdcbea0d01740c49b

2023-01-19 Thread Takayuki 'January June' Suwa via Gcc-patches
In the previously posted patch "xtensa: Make complex hard register clobber elimination more robust and accurate", the check code for insns that refer to the [DS]Cmode hard register before it is overwritten after it is clobbered is incomplete. Fortunately such insns are seldom emitted, so it didn'

[PATCH v3] xtensa: Eliminate unnecessary general-purpose reg-reg moves

2023-01-18 Thread Takayuki 'January June' Suwa via Gcc-patches
Register-register move instructions that can be easily seen as unnecessary by the human eye may remain in the compiled result. For example: /* example */ double test(double a, double b) { return __builtin_copysign(a, b); } test: add.n a3, a3, a3 extui a5, a5, 31, 1 s

[PATCH v4] xtensa: Eliminate the use of callee-saved register that saves and restores only once

2023-01-18 Thread Takayuki 'January June' Suwa via Gcc-patches
In the previous patch, if insn is JUMP_INSN or CALL_INSN, it bypasses the reg check (possibly FAIL). = In the case of the CALL0 ABI, values that must be retained before and after function calls are placed in the callee-saved registers (A12 through A15) and referenced later. However, it is of

[PATCH] xtensa: Optimize inversion of the MSB

2023-01-17 Thread Takayuki 'January June' Suwa via Gcc-patches
Such operation can be done either bitwise-XOR or addition with -2147483648, but the latter is one byte less if TARGET_DENSITY. gcc/ChangeLog: * config/xtensa/xtensa.md (xorsi3_internal): Rename from the original of "xorsi3". (xorsi3): New expansion pattern that emits addit
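The equivalence this description relies on (XOR with the top bit versus adding -2147483648) holds in 32-bit wrap-around arithmetic, because the carry out of bit 31 is discarded. A small C sketch using unsigned arithmetic, with hypothetical function names:

```c
#include <assert.h>
#include <stdint.h>

/* Invert bit 31 with a bitwise XOR. */
static uint32_t flip_msb_xor(uint32_t x)
{
  return x ^ 0x80000000u;
}

/* Invert bit 31 by adding 0x80000000 (the bit pattern of -2147483648);
   unsigned addition wraps modulo 2^32, so only bit 31 changes. */
static uint32_t flip_msb_add(uint32_t x)
{
  return x + 0x80000000u;
}

static void check_flip_msb(void)
{
  const uint32_t samples[] = { 0u, 1u, 0x7FFFFFFFu, 0x80000000u, 0xFFFFFFFFu };
  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    assert(flip_msb_xor(samples[i]) == flip_msb_add(samples[i]));
}
```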

[PATCH v2] xtensa: Eliminate unnecessary general-purpose reg-reg moves

2023-01-17 Thread Takayuki 'January June' Suwa via Gcc-patches
Register-register move instructions that can be easily seen as unnecessary by the human eye may remain in the compiled result. For example: /* example */ double test(double a, double b) { return __builtin_copysign(a, b); } test: add.n a3, a3, a3 extui a5, a5, 31, 1 s

[PATCH v3] xtensa: Eliminate the use of callee-saved register that saves and restores only once

2023-01-17 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2023/01/17 20:23, Max Filippov wrote: > Hi Suwa-san, Hi! > There's still a few regressions in tests with -fcompare-debug because > code generated with -g and without it is different: > E.g. check the following test with -g0 and -g: Again debug_insn is the problem... = In the case of the CA

[PATCH] xtensa: Eliminate unnecessary general-purpose reg-reg moves

2023-01-16 Thread Takayuki 'January June' Suwa via Gcc-patches
Register-register move instructions that can be easily seen as unnecessary by the human eye may remain in the compiled result. For example: /* example */ double test(double a, double b) { return __builtin_copysign(a, b); } test: add.n a3, a3, a3 extui a5, a5, 31, 1 s

[PATCH v2] xtensa: Eliminate the use of callee-saved register that saves and restores only once

2023-01-16 Thread Takayuki 'January June' Suwa via Gcc-patches
In the case of the CALL0 ABI, values that must be retained before and after function calls are placed in the callee-saved registers (A12 through A15) and referenced later. However, it is often the case that the save and the reference are each only once and a simple register-register move (the fra

[PATCH] xtensa: Eliminate the use of callee-saved register that saves and restores only once

2023-01-15 Thread Takayuki 'January June' Suwa via Gcc-patches
In the case of the CALL0 ABI, values that must be retained before and after function calls are placed in the callee-saved registers (A12 through A15) and referenced later. However, it is often the case that the save and the reference are each only once and a simple register-register move. e.g. i

[PATCH] xtensa: Remove old broken tweak for leaf function

2023-01-13 Thread Takayuki 'January June' Suwa via Gcc-patches
In the before-IRA era, ORDER_REGS_FOR_LOCAL_ALLOC was called for each function in Xtensa, and there was register allocation table reordering for leaf functions to compensate for the poor performance of local-alloc. Today the adjustment hook is still called via its alternative ADJUST_REG_ALLOC_ORDE

[PATCH 2/2] xtensa: Optimize ctzsi2 and ffssi2 a bit

2023-01-11 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch saves one byte when the Code Density Option is enabled, gcc/ChangeLog: * config/xtensa/xtensa.md (ctzsi2, ffssi2): Rearrange the emitting codes. --- gcc/config/xtensa/xtensa.md | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/gcc/config/xtens

[PATCH 1/2] xtensa: Tune "*btrue" insn pattern

2023-01-11 Thread Takayuki 'January June' Suwa via Gcc-patches
This branch instruction has short encoding if EQ/NE comparison against immediate zero when the Code Density Option is enabled, but its "length" attribute was only for normal encoding. This patch fixes it. This patch also prevents undesirable replacement the comparison immediate zero of the instr

Re: [PATCH] ifcvt.cc: Prevent excessive if-conversion for conditional moves

2023-01-11 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2023/01/11 17:02, Robin Dapp wrote: > Hi, Hi! > >> On optimizing for speed, default_noce_conversion_profitable_p() allows >> plenty of headroom, so this patch has little impact. >> >> Also, if the target-specific cost estimate is accurate or allows for >> margins, the impact should be similar

[PATCH] ifcvt.cc: Prevent excessive if-conversion for conditional moves

2023-01-10 Thread Takayuki 'January June' Suwa via Gcc-patches
Currently, cond_move_process_if_block() does the conversion without balancing the cost of the converted sequence with the original one, but this should be checked by calling targetm.noce_conversion_profitable_p(). Doing so allows us to provide a way based on the target-specific cost estimate, to p

[PATCH] xtensa: Make instruction cost estimation for size more accurate

2023-01-09 Thread Takayuki 'January June' Suwa via Gcc-patches
Until now, we applied COSTS_N_INSNS() (multiplying by 4) after dividing the instruction length by 3, so we couldn't express the difference less than modulo 3 in insn cost for size (e.g. 11 Bytes and 12 bytes cost the same). This patch fixes that. ;; 2 bytes addi.n a2, a2, -1 ; cost 3 ;; 3
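The loss of resolution described here comes from the order of the two operations: dividing by 3 before multiplying by 4 throws away the remainder. The sketch below illustrates the arithmetic only. Both formulas and their round-up behaviour are assumptions chosen to reproduce the two data points in the description (11 and 12 bytes costing the same before, a 2-byte instruction costing 3 after); they are not taken from the patch.

```c
#include <assert.h>

/* Same scaling as described above: one insn counts as 4 cost units. */
#define COSTS_N_INSNS(n) ((n) * 4)

/* Assumed old order: round bytes up to whole 3-byte insns, then scale. */
static int size_cost_old(int bytes)
{
  return COSTS_N_INSNS((bytes + 2) / 3);
}

/* Assumed new order: scale first, then divide by 3 (rounding up). */
static int size_cost_new(int bytes)
{
  return (COSTS_N_INSNS(bytes) + 2) / 3;
}

static void check_size_cost(void)
{
  assert(size_cost_old(11) == size_cost_old(12));  /* both 16 */
  assert(size_cost_new(11) != size_cost_new(12));  /* 15 and 16 */
  assert(size_cost_new(2) == 3);                   /* 2-byte insn */
}
```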

[PATCH v2] xtensa: Optimize bitwise splicing operation

2023-01-07 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch optimizes the operation of cutting and splicing two register values at a specified bit position, in other words, combining (bitwise ORing) bits 0 through (C-1) of the register with bits C through 31 of the other, where C is the specified immediate integer 17 through 31. This typically a
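In C terms, the operation being optimized can be written as a mask-and-merge. The sketch below shows that source-level shape only, with hypothetical names; it accepts any split position from 1 through 31 and does not reflect which positions the patch handles or what instructions it emits.

```c
#include <assert.h>
#include <stdint.h>

/* Combine bits 0..c-1 of low_src with bits c..31 of high_src.
   c is assumed to be in the range 1..31. */
static uint32_t splice(uint32_t low_src, uint32_t high_src, unsigned c)
{
  uint32_t low_mask = (UINT32_C(1) << c) - 1;  /* ones in bits 0..c-1 */
  return (low_src & low_mask) | (high_src & ~low_mask);
}

static void check_splice(void)
{
  assert(splice(0xFFFFFFFFu, 0u, 17) == 0x0001FFFFu);
  assert(splice(0u, 0xFFFFFFFFu, 17) == 0xFFFE0000u);
  assert(splice(0x12345678u, 0xABCDEF01u, 20) == 0xABC45678u);
}
```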

Re: [PATCH] xtensa: Optimize bitwise splicing operation

2023-01-07 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2023/01/08 6:53, Max Filippov wrote: > On Fri, Jan 6, 2023 at 6:55 PM Takayuki 'January June' Suwa > wrote: >> >> This patch optimizes the operation of cutting and splicing two register >> values at a specified bit position, in other words, combining (bitwise

[PATCH] xtensa: Optimize bitwise splicing operation

2023-01-06 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch optimizes the operation of cutting and splicing two register values at a specified bit position, in other words, combining (bitwise ORing) bits 0 through (C-1) of the register with bits C through 31 of the other, where C is the specified immediate integer 1 through 31. This typically ap

[PATCH v2] xtensa: Optimize stack frame adjustment more

2023-01-06 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch introduces a convenient helper function for integer immediate addition with scratch register as needed, that splits and emits either up to two ADDI/ADDMI machine instructions or an addition by register following an integer immediate load (which may later be transformed by constantsynth).

Re: [PATCH] xtensa: Optimize stack frame adjustment more

2023-01-06 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2023/01/06 17:05, Max Filippov wrote: > On Thu, Jan 5, 2023 at 10:57 PM Takayuki 'January June' Suwa > wrote: >> By using the helper function, it makes stack frame adjustment logic >> simplified and instruction count less in some cases. > > I've built

Re: [PATCH] xtensa: Optimize stack frame adjustment more

2023-01-05 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2023/01/06 15:26, Max Filippov wrote: > On Thu, Jan 5, 2023 at 7:35 PM Takayuki 'January June' Suwa > wrote: >> On second thought, it cannot be a good idea to split addition/subtraction to >> the stack pointer. >> >>> -4aaf: b0a192

Re: [PATCH] xtensa: Optimize stack frame adjustment more

2023-01-05 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2023/01/06 6:32, Max Filippov wrote: > Hi Suwa-san, Hi! > > On Thu, Jan 5, 2023 at 3:57 AM Takayuki 'January June' Suwa > wrote: >> >> This patch introduces a convenient helper function for integer immediate >> addition with scratch register as neede

[PATCH] xtensa: Optimize stack frame adjustment more

2023-01-05 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch introduces a convenient helper function for integer immediate addition with scratch register as needed, that splits and emits either up to two ADDI/ADDMI machine instructions or an addition by register following an immediate integer load (which may later be transformed by constantsynth).
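
As a rough illustration of the "up to two ADDI/ADDMI instructions" case, here is a minimal C sketch of how an immediate might be split. It is not the helper from the patch: the function name `split_add_immediate` is hypothetical, and the sketch assumes that ADDI takes a signed 8-bit immediate (-128 to 127) and that ADDMI takes a signed 8-bit immediate scaled by 256 (-32768 to 32512). When the split fails, a real implementation would fall back to loading the constant into a scratch register and adding by register, as the description says.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sketch (not the patch's helper): split 'v' into a part
   suitable for ADDMI (a multiple of 256 in -32768..32512, assumed range)
   and a part suitable for ADDI (-128..127, assumed range) such that
   addmi_part + addi_part == v.  Returns false if no such split exists.  */
static bool split_add_immediate(int32_t v, int32_t *addmi_part,
                                int32_t *addi_part)
{
    /* Sign-extend the low byte of v, giving a value in -128..127.  */
    int32_t lo = ((v & 0xFF) ^ 0x80) - 0x80;

    /* The remainder is a multiple of 256 by construction; compute it in
       64 bits so that extreme inputs cannot overflow.  */
    int64_t hi = (int64_t)v - lo;

    if (hi < -32768 || hi > 32512)
        return false;

    *addmi_part = (int32_t)hi;
    *addi_part = lo;
    return true;
}
```

Either part may come out as zero, in which case only one instruction would be needed.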

[PATCH] xtensa: Check DF availability before use

2022-12-29 Thread Takayuki 'January June' Suwa via Gcc-patches
Perhaps no problem, but for safety. gcc/ChangeLog: * config/xtensa/xtensa.cc (xtensa_expand_prologue): Fix to check DF availability before use of DF_* macros. --- gcc/config/xtensa/xtensa.cc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/config/xtensa/xte

[PATCH] xtensa: Apply a few minor fixes

2022-12-26 Thread Takayuki 'January June' Suwa via Gcc-patches
Almost cosmetic and no functional changes. gcc/ChangeLog: * config/xtensa/*: Tabify, and trim trailing spaces. * config/xtensa/xtensa.h (GP_RETURN, GP_RETURN_REG_COUNT): Change to GP_RETURN_FIRST and GP_RETURN_LAST, respectively. * config/xtensa/xtensa.cc (xtensa_f

Re: [PATCH 2/2] xtensa: Implement new target hook: TARGET_CONSTANT_OK_FOR_CPROP_P

2022-09-12 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2022/09/13 4:34, Max Filippov wrote: Hi! > On Sun, Sep 11, 2022 at 1:50 PM Takayuki 'January June' Suwa > wrote: >> >> This patch implements new target hook TARGET_CONSTANT_OK_FOR_CPROP_P in >> order to exclude CONST_INTs that cannot fit into a MOVI

[PATCH 2/2] xtensa: Implement new target hook: TARGET_CONSTANT_OK_FOR_CPROP_P

2022-09-11 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch implements new target hook TARGET_CONSTANT_OK_FOR_CPROP_P in order to exclude CONST_INTs that cannot fit into a MOVI machine instruction from cprop. gcc/ChangeLog: * config/xtensa/xtensa.c (TARGET_CONSTANT_OK_FOR_CPROP_P): New macro definition. (xtensa_constant_

[PATCH 1/2] Add new target hook: constant_ok_for_cprop_p

2022-09-11 Thread Takayuki 'January June' Suwa via Gcc-patches
Hi, Many RISC machines, as we know, have some restrictions on placing register-width constants in the source of load-immediate machine instructions, so the target must provide a solution for that in the machine description. A naive way would be to solve it early, ie. to replace with read consta

[PATCH] xtensa: constantsynth: Add new 3-insns synthesis pattern

2022-09-10 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch adds a new 3-instructions constant synthesis pattern: - A value that can fit into a signed 12-bit after a number of either bitwise left or right rotations: => "MOVI(.N) Ax, simm12" + "SSAI (1 ... 11) or (21 ... 31)" + "SRC Ax, Ax, Ax" gcc/ChangeLog: * config/xten
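
To make the condition "fits into a signed 12-bit after a number of rotations" concrete, the following C sketch searches for a qualifying rotation. It is not the constantsynth code: the names `rotr32`, `fits_simm12` and `find_rotated_simm12` are hypothetical, and the sketch assumes that "SSAI n" followed by "SRC Ax, Ax, Ax" acts as a 32-bit rotate right by n. The shift ranges (1 to 11 and 21 to 31) are taken from the description above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Rotate a 32-bit value right by n bits (n taken modulo 32).  */
static uint32_t rotr32(uint32_t x, unsigned n)
{
    n &= 31;
    return n ? (x >> n) | (x << (32 - n)) : x;
}

/* True if v is representable as a signed 12-bit immediate.  */
static bool fits_simm12(int32_t v)
{
    return v >= -2048 && v <= 2047;
}

/* Hypothetical sketch: look for a shift amount 'sa' in 1..11 or 21..31
   and a signed 12-bit 'imm' such that rotating imm right by sa yields
   'value'.  Returns false if none exists.  */
static bool find_rotated_simm12(uint32_t value, int32_t *imm, unsigned *sa)
{
    for (unsigned n = 1; n <= 31; n++) {
        if (n > 11 && n < 21)
            continue;   /* outside the ranges named in the description */

        /* Undo a right rotation by n, i.e. rotate left by n.  */
        uint32_t candidate = rotr32(value, 32 - n);

        if (fits_simm12((int32_t)candidate)) {
            *imm = (int32_t)candidate;
            *sa = n;
            return true;
        }
    }
    return false;
}
```

For instance, 0x80000000 is the value 1 rotated right by one bit, so it qualifies, whereas a value such as 0x12345678 has no rotation that fits in 12 signed bits.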

[PATCH v4 1/2] xtensa: Eliminate unused stack frame allocation/freeing

2022-09-08 Thread Takayuki 'January June' Suwa via Gcc-patches
Changes from v3: (xtensa_expand_prologue): Changed to exclude debug insns from DF use chain analysis. --- In the example below, 'x' is once placed on the stack frame and then read into registers as the argument value of bar(): /* example */ struct foo { int a, b; }; exte

[PATCH v3 1/2] xtensa: Eliminate unused stack frame allocation/freeing

2022-09-07 Thread Takayuki 'January June' Suwa via Gcc-patches
Changes from v2: (xtensa_expand_prologue): Changed to check conditions for suppressing emit insns in advance, instead of tracking emitted and later replacing them with NOPs if they are found to be unnecessary. --- In the example below, 'x' is once placed on the stack frame and then read into

[PATCH v2 1/2] xtensa: Eliminate unused stack frame allocation/freeing

2022-09-02 Thread Takayuki 'January June' Suwa via Gcc-patches
Changes from v1: (xtensa_expand_epilogue): Fixed forgetting to consider hard_frame_pointer_rtx when sharing codes. --- In the example below, 'x' is once placed on the stack frame and then read into registers as the argument value of bar(): /* example */ struct foo { int a, b;

[PATCH 1/2] xtensa: Eliminate unused stack frame allocation/freeing

2022-08-31 Thread Takayuki 'January June' Suwa via Gcc-patches
In the example below, 'x' is once placed on the stack frame and then read into registers as the argument value of bar(): /* example */ struct foo { int a, b; }; extern struct foo bar(struct foo); struct foo test(void) { struct foo x = { 0, 1 }; return bar(x);

[PATCH 2/2] xtensa: Make complex hard register clobber elimination more robust and accurate

2022-08-31 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch eliminates all clobbers for complex hard registers that will be overwritten entirely afterwards (supersedence of 3867d414bd7d9e5b6fb2a51b1fb3d9e9e1eae9). gcc/ChangeLog: * config/xtensa/xtensa.md: Rewrite the split pattern that performs the abovementioned process so that

[PATCH] xtensa: Improve indirect sibling call handling

2022-08-18 Thread Takayuki 'January June' Suwa via Gcc-patches
No longer needs the dedicated hard register (A11) for the address of the call and the split patterns for fixups, due to the introduction of appropriate register class and constraint. (Note: "ISC_REGS" contains a hard register A8 used as a "static chain" pointer for nested functions, but no proble

[PATCH] xtensa: Optimize stack pointer updates in function pro/epilogue under certain conditions

2022-08-17 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch enforces the use of "addmi" machine instruction instead of addition/subtraction with two source registers for adjusting the stack pointer, if the adjustment fits into a signed 16-bit and is also a multiple of 256. /* example */ void test(void) { char buffer[4096]; __
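
The stated condition (the adjustment fits into a signed 16-bit value and is a multiple of 256) can be written as a small predicate. This is only a sketch of the condition as worded in the description, not the code from the patch; the function name `addmi_adjustment_ok` is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical predicate: true if 'adj' fits into a signed 16-bit value
   and is a multiple of 256, which is the condition the description gives
   for adjusting the stack pointer with a single "addmi".  */
static bool addmi_adjustment_ok(int32_t adj)
{
    return adj >= -32768 && adj <= 32767 && (adj & 0xFF) == 0;
}
```

A 4096-byte frame, as in the example with `char buffer[4096]`, satisfies the predicate, while 4100 does not because it is not a multiple of 256.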

Re: [PATCH] xtensa: Prevent emitting integer additions of constant zero

2022-08-17 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2022/08/17 4:58, Max Filippov wrote: > Hi Suwa-san, Hi! > > On Tue, Aug 16, 2022 at 5:42 AM Takayuki 'January June' Suwa > wrote: >> >> In a few cases, obviously omittable add instructions can be emitted via >> invoking gen_addsi3. >> >> gc

[PATCH] xtensa: Prevent emitting integer additions of constant zero

2022-08-16 Thread Takayuki 'January June' Suwa via Gcc-patches
In a few cases, obviously omittable add instructions can be emitted via invoking gen_addsi3. gcc/ChangeLog: * config/xtensa/xtensa.md (addsi3_internal): Rename from "addsi3". (addsi3): New define_expand in order to reject integer additions of constant zero. --- gcc/config/

[PATCH] xtensa: Turn on -fsplit-wide-types-early by default

2022-08-14 Thread Takayuki 'January June' Suwa via Gcc-patches
Since GCC10, the "subreg2" optimization pass was no longer tied to enabling "subreg1" unless -fsplit-wide-types-early was turned on (PR88233). However on the Xtensa port, the lack of "subreg2" can degrade the quality of the output code, especially for those that produce many D[FC]mode pseudos. Th

Re: [PATCH] lower-subreg, expr: Mitigate inefficiencies derived from "(clobber (reg X))" followed by "(set (subreg (reg X)) (...))"

2022-08-04 Thread Takayuki 'January June' Suwa via Gcc-patches
(sorry repost due to the lack of cc here) Hi! On 2022/08/04 18:49, Richard Sandiford wrote: > Takayuki 'January June' Suwa writes: >> Thanks for your response. >> >> On 2022/08/03 16:52, Richard Sandiford wrote: >>> Takayuki 'January June'

Re: [PATCH] lower-subreg, expr: Mitigate inefficiencies derived from "(clobber (reg X))" followed by "(set (subreg (reg X)) (...))"

2022-08-03 Thread Takayuki 'January June' Suwa via Gcc-patches
Thanks for your response. On 2022/08/03 16:52, Richard Sandiford wrote: > Takayuki 'January June' Suwa via Gcc-patches writes: >> Emitting "(clobber (reg X))" before "(set (subreg (reg X)) (...))" keeps >> data flow consistent, but it also increases r

[PATCH] lower-subreg, expr: Mitigate inefficiencies derived from "(clobber (reg X))" followed by "(set (subreg (reg X)) (...))"

2022-08-02 Thread Takayuki 'January June' Suwa via Gcc-patches
Emitting "(clobber (reg X))" before "(set (subreg (reg X)) (...))" keeps data flow consistent, but it also increases register allocation pressure and thus often creates many unwanted register-to-register moves that cannot be optimized away. It seems just analogous to partial register stall which i

[PATCH 2/2] xtensa: Fix conflicting hard regno between indirect sibcall fixups and EH_RETURN_STACKADJ_RTX

2022-07-29 Thread Takayuki 'January June' Suwa via Gcc-patches
The hard register A10 was already allocated for EH_RETURN_STACKADJ_RTX. (although exception handling and sibling call may not apply at the same time, but for safety) gcc/ChangeLog: * config/xtensa/xtensa.md: Change hard register number used in the split patterns for indirect sibl
