Hi!
On 2023/09/07 23:22, Max Filippov wrote:
> gcc/
> * config/xtensa/predicates.md (xtensa_cstoresi_operator): Add
> unsigned comparisons.
> * config/xtensa/xtensa.cc (xtensa_expand_scc): Add code
> generation of salt/saltu instructions.
> * config/xtensa/xtensa.h (T
An idiomatic implementation of boolean evaluation of whether a register is
zero or not in Xtensa is to assign 0 and 1 to the temporary and destination,
and then issue the MOV[EQ/NE]Z machine instruction
(See 8.3.2 Instruction Idioms, Xtensa ISA refman., p.599):
;; A2 = (A3 != 0) ? 1 : 0;
m
On 2023/09/06 8:01, Max Filippov wrote:
> Hi Suwa-san,
Hi!
>
> On Tue, Sep 5, 2023 at 2:29 AM Takayuki 'January June' Suwa
> wrote:
>>
>> This patch optimizes the boolean evaluation for equality to 0 in SImode
>> using the MINU (Minimum Value Unsigned) machine instruction available
>> when TARGE
This patch optimizes the boolean evaluation for equality to 0 in SImode
using the MINU (Minimum Value Unsigned) machine instruction available
when TARGET_MINMAX is configured, for example, (x != 0) to MINU(x, 1)
and (x == 0) to (MINU(x, 1) ^ 1).
/* example */
int test0(int x) {
retur
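The MINU identity described above can be checked with a minimal C sketch. The helper names `minu`, `ne_zero` and `eq_zero` are hypothetical and only model the arithmetic; they are not taken from the patch.

```c
#include <stdint.h>

/* Models the unsigned minimum computed by MINU (illustration only). */
uint32_t minu(uint32_t a, uint32_t b) {
    return a < b ? a : b;
}

/* (x != 0) expressed as MINU(x, 1). */
int ne_zero(int x) {
    return (int)minu((uint32_t)x, 1u);
}

/* (x == 0) expressed as MINU(x, 1) ^ 1. */
int eq_zero(int x) {
    return (int)(minu((uint32_t)x, 1u) ^ 1u);
}
```

Any non-zero input, including negative ones, is at least 1 when viewed as unsigned, which is why the minimum with 1 yields the boolean directly.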
gcc/ChangeLog:
* config/xtensa/xtensa.cc (machine_function, xtensa_expand_prologue):
Change to use HARD_REG_BIT and its macros.
* config/xtensa/xtensa.md
(peephole2: regmove elimination during DFmode input reload):
Likewise.
---
gcc/config/xtensa/xtensa.cc
gcc/ChangeLog:
* config/xtensa/xtensa.md (*eqne_INT_MIN):
Add missing ":SI" to the match_operator.
---
gcc/config/xtensa/xtensa.md | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/gcc/config/xtensa/xtensa.md b/gcc/config/xtensa/xtensa.md
index 4b4ab3f5f37..b1af0
Because both smin and smax requiring TARGET_MINMAX are essential to the
RTL representation.
gcc/ChangeLog:
* config/xtensa/xtensa.cc (xtensa_match_CLAMPS_imms_p):
Simplify.
* config/xtensa/xtensa.md (*xtensa_clamps):
Add TARGET_MINMAX to the condition.
---
gcc/con
It used to always return a constant 4, which is the same as the default
behavior, but doesn't take into account the effects of secondary
reloads.
Therefore, the implementation of this target hook is removed.
gcc/ChangeLog:
* config/xtensa/xtensa.cc
(TARGET_MEMORY_MOVE_COST, xtensa_me
This patch adds a new 2-instructions constant synthesis pattern:
- A non-negative square value whose root fits into a signed 12-bit immediate:
=> "MOVI(.N) Ax, simm12" + "MULL Ax, Ax, Ax"
Due to the execution cost of the integer multiply instruction (MULL), this
synthesis works only when the 32-bit
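As a rough illustration of the "MOVI + MULL" pattern above, the following C sketch searches for a root that fits a signed 12-bit immediate. The function name `square_root_simm12` is hypothetical, and for simplicity only the roots 0 through 2047 are tried; how the actual patch detects such constants is not shown in the text.

```c
#include <stdint.h>

/* Hypothetical helper (not from the patch): returns the root r in
   0..2047 with r * r == v, or -1 if no such root exists. */
int32_t square_root_simm12(uint32_t v) {
    for (int32_t r = 0; r <= 2047; r++) {
        if ((uint32_t)(r * r) == v)
            return r;
    }
    return -1;
}
```

A value such as 1000000 would then be a candidate for "MOVI Ax, 1000" followed by "MULL Ax, Ax, Ax".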
On 2023/06/06 0:15, Max Filippov wrote:
> Hi Suwa-san,
Hi! Thanks for your regtest every time.
>
> On Mon, Jun 5, 2023 at 2:37 AM Takayuki 'January June' Suwa
> wrote:
>>
>> This patch optimizes the boolean evaluation of EQ/NE against zero
>> by adding two insn_and_split patterns similar to SIm
This patch optimizes the boolean evaluation of EQ/NE against zero
by adding two insn_and_split patterns similar to SImode conditional
store:
"eq_zero":
op0 = (op1 == 0) ? 1 : 0;
op0 = clz(op1) >> 5; /* optimized (requires TARGET_NSA) */
"movsicc_ne0_reg_0":
op0 = (op1 !=
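The "clz(op1) >> 5" form of "eq_zero" can be sketched in C as follows. This assumes that counting leading zeros of 0 yields 32, which is what makes the shift produce 1; `clz32` and `eq_zero_clz` are hypothetical names, and the loop is written out because `__builtin_clz(0)` is undefined in GNU C.

```c
#include <stdint.h>

/* Count leading zeros of a 32-bit word, with clz32(0) == 32
   (an assumption of this sketch). */
unsigned clz32(uint32_t x) {
    unsigned n = 0;
    if (x == 0)
        return 32;
    while (!(x & 0x80000000u)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* (x == 0) ? 1 : 0, computed as clz(x) >> 5: only the count 32 has bit 5 set. */
int eq_zero_clz(uint32_t x) {
    return (int)(clz32(x) >> 5);
}
```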
This patch optimizes both the boolean evaluation of and the branching on
EQ/NE against INT_MIN (-2147483648), by taking advantage of the fact that
the ABS machine instruction on Xtensa returns INT_MIN iff its input is
INT_MIN, and a non-negative value otherwise.
/* example */
int test0(int x) {
r
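The ABS property described above can be modelled in portable C by doing the negation in unsigned arithmetic, so that INT_MIN wraps to itself. The names `abs_wrap_bits` and `eq_int_min` are hypothetical; this is a sketch of the idea, not the code the patch emits.

```c
#include <stdint.h>

/* Bit pattern of a wrapping 32-bit absolute value: negating in
   unsigned arithmetic maps INT32_MIN back to 0x80000000. */
uint32_t abs_wrap_bits(int32_t x) {
    uint32_t u = (uint32_t)x;
    return x < 0 ? 0u - u : u;
}

/* x == INT32_MIN exactly when the wrapped absolute value still has
   its sign bit set. */
int eq_int_min(int32_t x) {
    return (int)(abs_wrap_bits(x) >> 31);
}
```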
This patch optimizes the boolean evaluation of EQ/NE against zero
by adding two insn_and_split patterns similar to SImode conditional
store:
"eq_zero":
op0 = (op1 == 0) ? 1 : 0;
op0 = clz(op1) >> 5; /* optimized (requires TARGET_NSA) */
"movsicc_ne0_reg_0":
op0 = (op1 !=
On 2023/06/01 23:20, Max Filippov wrote:
> On Wed, May 31, 2023 at 11:01 PM Takayuki 'January June' Suwa
> wrote:
>> More optimized than the default RTL generation.
>>
>> gcc/ChangeLog:
>>
>> * config/xtensa/xtensa.md (adddi3, subdi3):
>> New RTL generation patterns implemented acc
On 2023/05/31 15:02, Max Filippov wrote:
Hi!
> On Tue, May 30, 2023 at 2:50 AM Takayuki 'January June' Suwa
> wrote:
>>
>> Resubmitting the correct one due to a mistake in merging order of fixes.
>> ---
>> More optimized than the default RTL generation.
>>
>> gcc/ChangeLog:
>>
>> * config
Resubmitting the correct one due to a mistake in merging order of fixes.
---
This patch introduces more optimized implementations for the 6 cstoresi4
insn comparison methods (eq/ne/lt/le/gt/ge, however, required TARGET_NSA
for eq).
gcc/ChangeLog:
* config/xtensa/xtensa.cc (xtensa_expand_s
Resubmitting the correct one due to a mistake in merging order of fixes.
---
More optimized than the default RTL generation.
gcc/ChangeLog:
* config/xtensa/xtensa.md (adddi3, subdi3):
New RTL generation patterns implemented according to the instruc-
tion idioms described i
This patch introduces more optimized implementations for the 6 cstoresi4
insn comparison methods (eq/ne/lt/le/gt/ge, however, required TARGET_NSA
for eq).
gcc/ChangeLog:
* config/xtensa/xtensa.cc (xtensa_expand_scc):
Add dedicated optimization code for cstoresi4 (eq/ne/gt/ge/lt/le
More optimized than the default RTL generation.
gcc/ChangeLog:
* config/xtensa/xtensa.md (adddi3, subdi3):
New RTL generation patterns implemented according to the instruc-
tion idioms described in the Xtensa ISA reference manual (p. 600).
---
gcc/config/xtensa/xtensa.md
The insn "*shlrd_reg" shifts two registers with a funnel shifter by the
third register to get a single word result:
reg0 = (reg1 SHIFT_OP0 reg3) BIT_JOIN_OP (reg2 SHIFT_OP1 (32 - reg3))
where the funnel left shift is SHIFT_OP0 := ASHIFT, SHIFT_OP1 := LSHIFTRT
and its right shift is SHIFT_OP0 :=
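For the funnel left shift case of the formula above, a minimal C sketch might look like this. It assumes BIT_JOIN_OP is a bitwise OR and special-cases a shift amount of 0, since shifting a 32-bit value by 32 is undefined in C; `funnel_shl` is a hypothetical name.

```c
#include <stdint.h>

/* reg0 = (reg1 << n) | (reg2 >> (32 - n)), for n in 0..31. */
uint32_t funnel_shl(uint32_t reg1, uint32_t reg2, unsigned n) {
    n &= 31;
    if (n == 0)
        return reg1;            /* avoid the undefined shift by 32 */
    return (reg1 << n) | (reg2 >> (32 - n));
}
```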
In order to reject voodoo estimation logic with lots of magic numbers,
this patch revises the code to measure the costs of the three memset
methods based on the actual emission size of the insn sequence
corresponding to each method and choose the smallest one.
gcc/ChangeLog:
* config/xten
gcc/ChangeLog:
* config/xtensa/xtensa.md (*extzvsi-1bit_ashlsi3):
Retract excessive line folding, and correct the value of
the "length" insn attribute related to TARGET_DENSITY.
(*extzvsi-1bit_addsubx): Ditto.
---
gcc/config/xtensa/xtensa.md | 11 ++-
1 fil
This patch tries to eliminate the use of a temporary pseudo for
'(minus:SI (const_int) (reg:SI))' if the addition of the negative constant
value can be emitted in a single machine instruction.
/* example */
int test0(int x) {
return 1 - x;
}
int test1(int x) {
return 100 - x;
On 2023/05/23 11:27, Max Filippov wrote:
> Hi Suwa-san,
Hi!
> This change introduces a bunch of test failures on big endian configuration.
> I believe that's because the starting bit position for zero_extract is counted
> from different ends depending on the endianness.
Oops, what a stupid mista
This patch removes one machine instruction from the "single bit extraction
with shifting" operation, and tries to eliminate the conditional
branch if CST2_POW2 doesn't fit into signed 12 bits with the help
of ifcvt optimization.
/* example #1 */
int test0(int x) {
return (x & 1048576) !
By making use of the 'addsub_operator' added in the last patch.
gcc/ChangeLog:
* config/xtensa/xtensa.md (*addsubx): Rename from '*addx',
and change to also accept '*subx' pattern.
(*subx): Remove.
---
gcc/config/xtensa/xtensa.md | 31 +--
1 fi
On 2023/05/08 22:43, Richard Biener wrote:
[snip]
>> -mlra
>
> If they were in any released compiler options should be kept
> (doing nothing) for backward compatibility. Use for example
>
> mlra
> Target WarnRemoved
> Removed in GCC 14. This switch has no effect.
>
> or
>
> mlra
> Target Igno
gcc/ChangeLog:
* config/xtensa/constraints.md (R, T, U):
Change define_constraint to define_memory_constraint.
* config/xtensa/xtensa.cc
(xtensa_lra_p, TARGET_LRA_P): Remove.
(xtensa_emit_move_sequence): Remove "if (reload_in_progress)"
clause as it
Because GO_IF_LEGITIMATE_ADDRESS was deprecated a long time ago
(see commit c6c3dba931548987c78719180e30ebc863404b89).
gcc/ChangeLog:
* config/xtensa/xtensa.h (REG_OK_STRICT, REG_OK_FOR_INDEX_P,
REG_OK_FOR_BASE_P): Remove.
---
gcc/config/xtensa/xtensa.h | 11 +--
1 file c
This patch makes LRA work well, with some exceptions
(e.g. MI thunk generation due to pretending reload_completed).
gcc/ChangeLog:
* config/xtensa/constraints.md (R, T, U):
Change define_constraint to define_memory_constraint.
* config/xtensa/xtensa.cc (xtensa_legitimate_constan
This patch introduces the use of CLAMPS instruction when the instruction
is configured.
/* example */
int test(int a) {
if (a < -512)
return -512;
if (a > 511)
return 511;
return a;
}
;; prereq: TARGET_CLAMPS
test:
clamps a2, a2, 9
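The saturation performed in the example can be written as a small C model. It generalizes from the example, where the immediate 9 corresponds to the range -512..511, to a range of -2^n..2^n-1; that generalization and the name `clamps` as a C function are assumptions of this sketch.

```c
#include <stdint.h>

/* Clamp x to the signed range [-2^n, 2^n - 1], for small n. */
int32_t clamps(int32_t x, unsigned n) {
    int32_t lo = -((int32_t)1 << n);
    int32_t hi = ((int32_t)1 << n) - 1;
    return x < lo ? lo : (x > hi ? hi : x);
}
```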
Hello, Max:
On 2023/02/25 19:01, Max Filippov wrote:
> gcc/
> PR target/108919
>
> * config/xtensa/xtensa-protos.h
> (xtensa_prepare_expand_call): Rename to xtensa_expand_call.
> * config/xtensa/xtensa.cc (xtensa_prepare_expand_call): Rename
> to xtensa_expa
gcc/ChangeLog:
* config/xtensa/xtensa.md
(zero_cost_loop_start, zero_cost_loop_end, loop_end):
Add missing "SI:" to PLUS RTXes.
---
gcc/config/xtensa/xtensa.md | 12 ++--
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/gcc/config/xtensa/xtensa.md b/gc
In commit b2ef02e8cbbaf95fee98be255f697f47193960ec, the sibling call
insn included (use (reg:SI A0_REG)) to fix the problem, which added
a USE chain unconditionally to the data flow of register A0 during
the sibling call.
As a result, df_regs_ever_live_p (A0_REG) returns true, so even if
register
Leaf function often omits saving its return address to the stack slot,
and this feature often makes debugging very confusing, especially for
stack dump analysis.
gcc/ChangeLog:
* config/xtensa/xtensa.cc (xtensa_call_save_reg): Change to return
true if register A0 (return address r
Register-register move instructions that can be easily seen as
unnecessary by the human eye may remain in the compiled result.
For example:
/* example */
double test(double a, double b) {
return __builtin_copysign(a, b);
}
test:
add.n a3, a3, a3
extui a5, a5, 31, 1
s
In the case of the CALL0 ABI, values that must be retained before and
after function calls are placed in the callee-saved registers (A12
through A15) and referenced later. However, it is often the case that
the save and the reference are each only once and a simple register-
register move (with tw
On 2023/02/16 7:18, Max Filippov wrote:
> Hi Suwa-san,
Hi!
>
> On Thu, Jan 26, 2023 at 7:17 PM Takayuki 'January June' Suwa
> wrote:
>>
>> In the case of the CALL0 ABI, values that must be retained before and
>> after function calls are placed in the callee-saved registers (A12
>> through A15)
In the case of the CALL0 ABI, values that must be retained before and
after function calls are placed in the callee-saved registers (A12
through A15) and referenced later. However, it is often the case that
the save and the reference are each only once and a simple register-
register move (with tw
Register-register move instructions that can be easily seen as
unnecessary by the human eye may remain in the compiled result.
For example:
/* example */
double test(double a, double b) {
return __builtin_copysign(a, b);
}
test:
add.n a3, a3, a3
extui a5, a5, 31, 1
s
In the case of the CALL0 ABI, values that must be retained before and
after function calls are placed in the callee-saved registers (A12
through A15) and referenced later. However, it is often the case that
the save and the reference are each only once and a simple register-
register move (with tw
On 2023/01/23 0:45, Max Filippov wrote:
> On Fri, Jan 20, 2023 at 8:39 PM Takayuki 'January June' Suwa
> wrote:
>> On 2023/01/21 0:14, Max Filippov wrote:
>>> After having this many attempts and getting to the issues that are
>>> really hard to detect I wonder if the target backend is the right pl
On 2023/01/21 0:14, Max Filippov wrote:
> Hi Suwa-san,
Hi!
>
> On Wed, Jan 18, 2023 at 7:50 PM Takayuki 'January June' Suwa
> wrote:
>>
>> In the previous patch, if insn is JUMP_INSN or CALL_INSN, it bypasses the
>> reg check (possibly FAIL).
>>
>> =
>> In the case of the CALL0 ABI, values
In the previously posted patch
"xtensa: Make complex hard register clobber elimination more robust and
accurate",
the check code for insns that refer to the [DS]Cmode hard register before
it is overwritten after it is clobbered is incomplete. Fortunately such
insns are seldom emitted, so it didn'
Register-register move instructions that can be easily seen as
unnecessary by the human eye may remain in the compiled result.
For example:
/* example */
double test(double a, double b) {
return __builtin_copysign(a, b);
}
test:
add.n a3, a3, a3
extui a5, a5, 31, 1
s
In the previous patch, if insn is JUMP_INSN or CALL_INSN, it bypasses the reg
check (possibly FAIL).
=
In the case of the CALL0 ABI, values that must be retained before and
after function calls are placed in the callee-saved registers (A12
through A15) and referenced later. However, it is of
Such operation can be done either bitwise-XOR or addition with -2147483648,
but the latter is one byte less if TARGET_DENSITY.
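The equivalence relied on here is that, modulo 2^32, adding 0x80000000 only toggles the top bit, just as XOR does. A short C check, with hypothetical function names:

```c
#include <stdint.h>

/* Flip the sign bit with XOR. */
uint32_t flip_sign_xor(uint32_t x) {
    return x ^ 0x80000000u;
}

/* Flip the sign bit by adding the bit pattern of -2147483648;
   the carry out of bit 31 is discarded by the 32-bit wraparound. */
uint32_t flip_sign_add(uint32_t x) {
    return x + 0x80000000u;
}
```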
gcc/ChangeLog:
* config/xtensa/xtensa.md (xorsi3_internal):
Rename from the original of "xorsi3".
(xorsi3): New expansion pattern that emits addit
Register-register move instructions that can be easily seen as
unnecessary by the human eye may remain in the compiled result.
For example:
/* example */
double test(double a, double b) {
return __builtin_copysign(a, b);
}
test:
add.n a3, a3, a3
extui a5, a5, 31, 1
s
On 2023/01/17 20:23, Max Filippov wrote:
> Hi Suwa-san,
Hi!
> There's still a few regressions in tests with -fcompare-debug because
> code generated with -g and without it is different:
> E.g. check the following test with -g0 and -g:
Again debug_insn is the problem...
=
In the case of the CA
Register-register move instructions that can be easily seen as
unnecessary by the human eye may remain in the compiled result.
For example:
/* example */
double test(double a, double b) {
return __builtin_copysign(a, b);
}
test:
add.n a3, a3, a3
extui a5, a5, 31, 1
s
In the case of the CALL0 ABI, values that must be retained before and
after function calls are placed in the callee-saved registers (A12
through A15) and referenced later. However, it is often the case that
the save and the reference are each only once and a simple register-
register move (the fra
In the case of the CALL0 ABI, values that must be retained before and
after function calls are placed in the callee-saved registers (A12
through A15) and referenced later. However, it is often the case that
the save and the reference are each only once and a simple register-
register move.
e.g. i
In the before-IRA era, ORDER_REGS_FOR_LOCAL_ALLOC was called for each
function in Xtensa, and there was register allocation table reordering
for leaf functions to compensate for the poor performance of local-alloc.
Today the adjustment hook is still called via its alternative
ADJUST_REG_ALLOC_ORDE
This patch saves one byte when the Code Density Option is enabled.
gcc/ChangeLog:
* config/xtensa/xtensa.md (ctzsi2, ffssi2):
Rearrange the emitting codes.
---
gcc/config/xtensa/xtensa.md | 8
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/gcc/config/xtens
This branch instruction has short encoding if EQ/NE comparison against
immediate zero when the Code Density Option is enabled, but its "length"
attribute was only for normal encoding. This patch fixes it.
This patch also prevents undesirable replacement of the comparison immediate
zero of the instr
On 2023/01/11 17:02, Robin Dapp wrote:
> Hi,
Hi!
>
>> On optimizing for speed, default_noce_conversion_profitable_p() allows
>> plenty of headroom, so this patch has little impact.
>>
>> Also, if the target-specific cost estimate is accurate or allows for
>> margins, the impact should be similar
Currently, cond_move_process_if_block() does the conversion without
balancing the cost of the converted sequence with the original one, but
this should be checked by calling targetm.noce_conversion_profitable_p().
Doing so allows us to provide a way based on the target-specific cost
estimate, to p
Until now, we applied COSTS_N_INSNS() (multiplying by 4) after dividing
the instruction length by 3, so we couldn't express the difference less
than modulo 3 in insn cost for size (e.g. 11 Bytes and 12 bytes cost the
same).
This patch fixes that.
;; 2 bytes
addi.n a2, a2, -1 ; cost 3
;; 3
This patch optimizes the operation of cutting and splicing two register
values at a specified bit position, in other words, combining (bitwise
ORing) bits 0 through (C-1) of the register with bits C through 31
of the other, where C is the specified immediate integer 17 through 31.
This typically a
On 2023/01/08 6:53, Max Filippov wrote:
> On Fri, Jan 6, 2023 at 6:55 PM Takayuki 'January June' Suwa
> wrote:
>>
>> This patch optimizes the operation of cutting and splicing two register
>> values at a specified bit position, in other words, combining (bitwise
>> ORing) bits 0 through (C-1) of t
This patch optimizes the operation of cutting and splicing two register
values at a specified bit position, in other words, combining (bitwise
ORing) bits 0 through (C-1) of the register with bits C through 31
of the other, where C is the specified immediate integer 1 through 31.
This typically ap
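The "cut and splice" operation described above can be expressed in C as follows, for a split position C in 1..31. The function name `splice` and its parameter names are hypothetical.

```c
#include <stdint.h>

/* Combine bits 0..(c-1) of low_src with bits c..31 of high_src. */
uint32_t splice(uint32_t low_src, uint32_t high_src, unsigned c) {
    uint32_t low_mask = (1u << c) - 1u;   /* valid for c in 1..31 */
    return (low_src & low_mask) | (high_src & ~low_mask);
}
```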
This patch introduces a convenient helper function for integer immediate
addition with scratch register as needed, that splits and emits either
up to two ADDI/ADDMI machine instructions or an addition by register
following an integer immediate load (which may later be transformed by
constantsynth).
On 2023/01/06 17:05, Max Filippov wrote:
> On Thu, Jan 5, 2023 at 10:57 PM Takayuki 'January June' Suwa
> wrote:
>> By using the helper function, it makes stack frame adjustment logic
>> simplified and instruction count less in some cases.
>
> I've built a couple linux configurations with and wit
On 2023/01/06 15:26, Max Filippov wrote:
> On Thu, Jan 5, 2023 at 7:35 PM Takayuki 'January June' Suwa
> wrote:
>> On second thought, it cannot be a good idea to split addition/subtraction to
>> the stack pointer.
>>
>>> -4aaf: b0a192 movi a9, 0x1b0
>>> -4ab2: 1f9a
On 2023/01/06 6:32, Max Filippov wrote:
> Hi Suwa-san,
Hi!
>
> On Thu, Jan 5, 2023 at 3:57 AM Takayuki 'January June' Suwa
> wrote:
>>
>> This patch introduces a convenient helper function for integer immediate
>> addition with scratch register as needed, that splits and emits either
>> up to tw
This patch introduces a convenient helper function for integer immediate
addition with scratch register as needed, that splits and emits either
up to two ADDI/ADDMI machine instructions or an addition by register
following an immediate integer load (which may later be transformed by
constantsynth).
Perhaps not a problem, but for safety.
gcc/ChangeLog:
* config/xtensa/xtensa.cc (xtensa_expand_prologue): Fix to check
DF availability before use of DF_* macros.
---
gcc/config/xtensa/xtensa.cc | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/gcc/config/xtensa/xte
Almost cosmetic and no functional changes.
gcc/ChangeLog:
* config/xtensa/*: Tabify, and trim trailing spaces.
* config/xtensa/xtensa.h (GP_RETURN, GP_RETURN_REG_COUNT):
Change to GP_RETURN_FIRST and GP_RETURN_LAST, respectively.
* config/xtensa/xtensa.cc (xtensa_f
On 2022/09/13 4:34, Max Filippov wrote:
Hi!
> On Sun, Sep 11, 2022 at 1:50 PM Takayuki 'January June' Suwa
> wrote:
>>
>> This patch implements new target hook TARGET_CONSTANT_OK_FOR_CPROP_P in
>> order to exclude CONST_INTs that cannot fit into a MOVI machine instruction
>> from cprop.
>>
>> gcc
This patch implements new target hook TARGET_CONSTANT_OK_FOR_CPROP_P in
order to exclude CONST_INTs that cannot fit into a MOVI machine instruction
from cprop.
gcc/ChangeLog:
* config/xtensa/xtensa.c (TARGET_CONSTANT_OK_FOR_CPROP_P):
New macro definition.
(xtensa_constant_
Hi,
Many RISC machines, as we know, have some restrictions on placing
register-width constants in the source of load-immediate machine instructions,
so the target must provide a solution for that in the machine description.
A naive way would be to solve it early, ie. to replace with read consta
This patch adds a new 3-instructions constant synthesis pattern:
- A value that can fit into a signed 12-bit after a number of either bitwise
left or right rotations:
=> "MOVI(.N) Ax, simm12" + "SSAI (1 ... 11) or (21 ... 31)"
+ "SRC Ax, Ax, Ax"
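To illustrate which values qualify, here is a C sketch that looks for a left rotation after which the value fits a signed 12-bit immediate. All names are hypothetical, and the sketch tries every rotation count from 1 to 31; it does not model the SSAI ranges (1 ... 11) or (21 ... 31) listed above, nor the direction in which the emitted code rotates back.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by n (n in 0..31). */
static uint32_t rotl32(uint32_t v, unsigned n) {
    n &= 31;
    return n ? (v << n) | (v >> (32 - n)) : v;
}

/* True if the bit pattern is a sign-extended 12-bit value (-2048..2047). */
static int fits_simm12_bits(uint32_t v) {
    return v <= 2047u || v >= 0xFFFFF800u;
}

/* Returns the smallest left-rotation count in 1..31 that makes v fit,
   or -1 if there is none. */
int find_rotl_to_simm12(uint32_t v) {
    for (unsigned n = 1; n < 32; n++) {
        if (fits_simm12_bits(rotl32(v, n)))
            return (int)n;
    }
    return -1;
}
```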
gcc/ChangeLog:
* config/xten
Changes from v3:
(xtensa_expand_prologue): Changed to exclude debug insns from DF use chain
analysis.
---
In the example below, 'x' is once placed on the stack frame and then read
into registers as the argument value of bar():
/* example */
struct foo {
int a, b;
};
exte
Changes from v2:
(xtensa_expand_prologue): Changed to check conditions for suppressing emit
insns in advance, instead of tracking emitted and later replacing them with
NOPs if they are found to be unnecessary.
---
In the example below, 'x' is once placed on the stack frame and then read
into
Changes from v1:
(xtensa_expand_epilogue): Fixed forgetting to consider hard_frame_pointer_rtx
when sharing codes.
---
In the example below, 'x' is once placed on the stack frame and then read
into registers as the argument value of bar():
/* example */
struct foo {
int a, b;
In the example below, 'x' is once placed on the stack frame and then read
into registers as the argument value of bar():
/* example */
struct foo {
int a, b;
};
extern struct foo bar(struct foo);
struct foo test(void) {
struct foo x = { 0, 1 };
return bar(x);
This patch eliminates all clobbers for complex hard registers that will
be overwritten entirely afterwards (supersedence of
3867d414bd7d9e5b6fb2a51b1fb3d9e9e1eae9).
gcc/ChangeLog:
* config/xtensa/xtensa.md: Rewrite the split pattern that performs
the abovementioned process so that
No longer needs the dedicated hard register (A11) for the address of the
call and the split patterns for fixups, due to the introduction of appropriate
register class and constraint.
(Note: "ISC_REGS" contains a hard register A8 used as a "static chain"
pointer for nested functions, but no proble
This patch enforces the use of "addmi" machine instruction instead of
addition/subtraction with two source registers for adjusting the stack
pointer, if the adjustment fits into a signed 16-bit and is also a multiple
of 256.
/* example */
void test(void) {
char buffer[4096];
__
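The condition stated above (fits into a signed 16-bit value and is a multiple of 256) can be written as a small C predicate. The name `fits_addmi` is hypothetical; the text elsewhere mentions a function called xtensa_simm8x256(), whose actual definition is not shown here.

```c
#include <stdint.h>

/* True if v is a multiple of 256 within the signed 16-bit range. */
int fits_addmi(int32_t v) {
    return v >= -32768 && v <= 32767 && (v % 256) == 0;
}
```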
On 2022/08/17 4:58, Max Filippov wrote:
> Hi Suwa-san,
Hi!
>
> On Tue, Aug 16, 2022 at 5:42 AM Takayuki 'January June' Suwa
> wrote:
>>
>> In a few cases, obviously omittable add instructions can be emitted via
>> invoking gen_addsi3.
>>
>> gcc/ChangeLog:
>>
>> * config/xtensa/xtensa.md (
In a few cases, obviously omittable add instructions can be emitted via
invoking gen_addsi3.
gcc/ChangeLog:
* config/xtensa/xtensa.md (addsi3_internal): Rename from "addsi3".
(addsi3): New define_expand in order to reject integer additions of
constant zero.
---
gcc/config/
Since GCC10, the "subreg2" optimization pass was no longer tied to enabling
"subreg1" unless -fsplit-wide-types-early was turned on (PR88233). However
on the Xtensa port, the lack of "subreg2" can degrade the quality of the
output code, especially for those that produce many D[FC]mode pseudos.
Th
(sorry repost due to the lack of cc here)
Hi!
On 2022/08/04 18:49, Richard Sandiford wrote:
> Takayuki 'January June' Suwa writes:
>> Thanks for your response.
>>
>> On 2022/08/03 16:52, Richard Sandiford wrote:
>>> Takayuki 'January June'
Thanks for your response.
On 2022/08/03 16:52, Richard Sandiford wrote:
> Takayuki 'January June' Suwa via Gcc-patches writes:
>> Emitting "(clobber (reg X))" before "(set (subreg (reg X)) (...))" keeps
>> data flow consistent, but it also increases r
Emitting "(clobber (reg X))" before "(set (subreg (reg X)) (...))" keeps
data flow consistent, but it also increases register allocation pressure
and thus often creates many unwanted register-to-register moves that
cannot be optimized away. It seems just analogous to partial register
stall which i
The hard register A10 was already allocated for EH_RETURN_STACKADJ_RTX.
(although exception handling and sibling call may not apply at the same time,
but for safety)
gcc/ChangeLog:
* config/xtensa/xtensa.md: Change hard register number used in
the split patterns for indirect sibl
It takes one machine instruction for both conditional branch and move.
gcc/ChangeLog:
* config/xtensa/xtensa.cc (xtensa_rtx_costs):
Add new case for IF_THEN_ELSE.
---
gcc/config/xtensa/xtensa.cc | 1 +
1 file changed, 1 insertion(+)
diff --git a/gcc/config/xtensa/xtensa.cc b/gcc/
The RTL combiner will transform "if ((x & C) == C) goto label;"
into "if ((~x & C) == 0) goto label;" and will try to match it with
the insn patterns.
/* example */
void test_0(int a) {
if ((char)a == 255)
foo();
}
void test_1(int a) {
if ((unsigned short)a == 0
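The transformation the combiner applies rests on a simple identity: all bits of C are set in x exactly when no bit of C is set in ~x. A minimal C check, with hypothetical function names:

```c
#include <stdint.h>

/* Original form: (x & C) == C. */
int all_set_and(uint32_t x, uint32_t c) {
    return (x & c) == c;
}

/* Combiner's form: (~x & C) == 0. */
int all_set_not(uint32_t x, uint32_t c) {
    return (~x & c) == 0;
}
```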
This patch corrects the overestimation of the relative cost of
'(set (reg) (const_int N))' where N fits into the instruction itself.
In fact, such overestimation confuses the RTL loop invariant motion pass.
As a result, it brings almost no negative impact from the speed point of
view, but addtiion
This patch allows the constant synthesis to choose shorter instruction
if possible.
/* example */
int test(void) {
return 128 << 8;
}
;; before
test:
movi a2, 0x100
addmi a2, a2, 0x7f00
ret.n
;; after
test:
movi.n a2, 1
This patch enhances the effectiveness of the previously posted one:
"xtensa: Optimize bitwise AND operation with some specific forms of constants".
/* example */
extern void foo(int);
void test(int a) {
if ((a & (-1U << 8)) == (128 << 8)) /* 0 or one of "b4const" */
foo(
This patch fixes a non-fatal issue about negative constant values derived
from FP constant synthesis on hosts whose 'long' is wider than 'int32_t'.
And also replaces the dedicated code in FP constant synthesis split
pattern with the appropriate existing function call.
gcc/ChangeLog:
* c
On 2022/07/07 23:46, Jeff Law wrote:
> This is an update to a patch originally posted by Takayuki Suwa a few months
> ago.
>
> When we initialize an array from a STRING_CST we perform the initialization
> in two steps. The first step copies the STRING_CST to the destination. The
> second step
Such constants are often subject to the constant synthesis:
int test(int a) {
return a - 31999;
}
test:
movi a3, 1
addmi a3, a3, -0x7d00
add a2, a2, a3
ret
This patch optimizes such case as follows:
test:
addi a2, a2, 1
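The arithmetic behind this example is that -31999 splits into 1 + (-0x7d00), that is, a small signed 8-bit part plus a multiple of 256. A C sketch of such a split, using the hypothetical helpers `split_lo` and `split_hi` (the patch's own splitting logic is not shown in the text):

```c
#include <stdint.h>

/* Low part of v, wrapped into the signed 8-bit range -128..127. */
int32_t split_lo(int32_t v) {
    int32_t lo = ((v % 256) + 256) % 256;   /* 0..255 */
    return lo >= 128 ? lo - 256 : lo;
}

/* Remaining high part; always a multiple of 256. */
int32_t split_hi(int32_t v) {
    return v - split_lo(v);
}
```

For v = -31999 this gives a low part of 1 and a high part of -32000 (-0x7d00), matching the immediates in the example.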
Fortify buffer overflow message reported.
(see https://github.com/earlephilhower/esp-quick-toolchain/issues/36)
gcc/ChangeLog:
* config/xtensa/xtensa.md (bswapsi2_internal):
Enlarge the buffer that is obviously smaller than the template
string given to sprintf().
---
gcc/
No functional changes.
gcc/ChangeLog:
* config/xtensa/xtensa.cc (xtensa_emit_move_sequence):
Use can_create_pseudo_p(), instead of using individual
reload_in_progress and reload_completed.
(xtensa_expand_block_set_small_loop): Use xtensa_simm8x256(),
the ex
These instructions will all be converted to L32R ones with litpool entries
by the assembler.
gcc/ChangeLog:
* config/xtensa/xtensa.cc (xtensa_is_insn_L32R_p):
Consider relaxed MOVI instructions as L32R.
---
gcc/config/xtensa/xtensa.cc | 22 ++
1 file changed,
erratum:
- extern unsigned int value;
+ extern unsigned short value;
On 2022/06/17 22:47, Takayuki 'January June' Suwa via Gcc-patches wrote:
> Storing integer constants into litpool in the early stage of compilation
> hinders some integer optimizations. In fa
Storing integer constants into litpool in the early stage of compilation
hinders some integer optimizations. In fact, such integer constants are
not subject to the constant folding process.
For example:
extern unsigned int value;
extern void foo(void);
void test(void) {
if (val
On 2022/06/15 5:17, Max Filippov wrote:
> Hi Suwa-san,
hi!
> This change results in a bunch of new regression test failures:
> The code generated for e.g. gcc.c-torture/execute/921208-2.c looks like this:
oh, PICed...
indirect (incl. via function pointer, virtual functions and of course PIC ones