[PATCH] Move get_call_rtx_from to final.c

2025-06-01 Thread H.J. Lu
(get_call_rtx_from): Removed. Tested on x86-64. -- H.J. From 5f68d5644870e3984343240b26918ee97e3e6f17 Mon Sep 17 00:00:00 2001 From: "H.J. Lu" Date: Sun, 1 Jun 2025 09:29:48 +0800 Subject: [PATCH] Move get_call_rtx_from to final.c Move get_call_rtx_from to final.c and call call_from_call_

[PATCH] Also check function symbol for function declaration

2025-05-31 Thread H.J. Lu
SI (use (reg:SI 5 di)) (expr_list:SI (use (reg:SI 4 si)) (nil PR other/120494 * rtlanal.cc (get_call_fndecl): Also check function symbol to get function declaration. -- H.J. From 6e20aad0c0b02c688f93ebc20b160f31c23adc82 Mon Sep 17 00:00:00 2001 From: "H.J. Lu" Date: Sat, 31 M

[PATCH v3] x86: Enable *mov_(and|or) only for -Oz

2025-05-25 Thread H.J. Lu
On Sun, May 25, 2025 at 7:02 PM H.J. Lu wrote: > > On Sun, May 25, 2025 at 8:12 AM H.J. Lu wrote: > > > > On Sun, May 25, 2025 at 7:47 AM H.J. Lu wrote: > > > > > > commit ef26c151c14a87177d46fd3d725e7f82e040e89f > > > Author: Roger Sayle

[PATCH v2] x86: Enable *mov_(and|or_store) only for -Oz

2025-05-25 Thread H.J. Lu
On Sun, May 25, 2025 at 8:12 AM H.J. Lu wrote: > > On Sun, May 25, 2025 at 7:47 AM H.J. Lu wrote: > > > > commit ef26c151c14a87177d46fd3d725e7f82e040e89f > > Author: Roger Sayle > > Date: Thu Dec 23 12:33:07 2021 + > > > > x86: PR target/1

[PATCH] x86: Enable *mov_(and|or) only for -Oz

2025-05-24 Thread H.J. Lu
On Sun, May 25, 2025 at 7:47 AM H.J. Lu wrote: > > commit ef26c151c14a87177d46fd3d725e7f82e040e89f > Author: Roger Sayle > Date: Thu Dec 23 12:33:07 2021 + > > x86: PR target/103773: Fix wrong-code with -Oz from pop to memory. > > transformed "mov $0,me

[PATCH] x86: Enable *mov_and only for -Oz

2025-05-24 Thread H.J. Lu
stsuite/ PR target/120427 * gcc.target/i386/pr120427.c: New test. OK for master? -- H.J. From ff829a2a7e13e1f6b1333f169b2f6adae6a5c192 Mon Sep 17 00:00:00 2001 From: "H.J. Lu" Date: Sun, 25 May 2025 07:40:29 +0800 Subject: [PATCH] x86: Enable *mov_and only for -Oz commit ef26c1

Re: [PATCH] x86: Add preserve_none and update no_caller_saved_registers attributes

2025-05-22 Thread H.J. Lu
On Wed, May 14, 2025 at 2:12 PM Hongtao Liu wrote: > > On Fri, Apr 18, 2025 at 7:10 PM H.J. Lu wrote: > > > > Add preserve_none attribute which is similar to no_callee_saved_registers > > attribute, except on x86-64, r12, r13, r14, r15, rdi and rsi registers are > Co

[PATCH v2] x86: Add preserve_none and update no_caller_saved_registers attributes

2025-05-22 Thread H.J. Lu
gcc.target/i386/preserve-none-27.c: Likewise. * gcc.target/i386/preserve-none-28.c: Likewise. * gcc.target/i386/preserve-none-29.c: Likewise. * gcc.target/i386/preserve-none-30a.c: Likewise. * gcc.target/i386/preserve-none-30b.c: Likewise. Signed-off-by: H.J. Lu

[PATCH] x86: Remove df_insn_rescan after emit_insn_*

2025-05-11 Thread H.J. Lu
): Likewise. (replace_vector_const): Likewise. OK for master? -- H.J. From 6fbdc43bfc32ed6c88891f84bd367696cca1e247 Mon Sep 17 00:00:00 2001 From: "H.J. Lu" Date: Mon, 12 May 2025 10:02:24 +0800 Subject: [PATCH] x86: Remove df_insn_rescan after emit_insn_* Since df_insn_rescan has been

Re: i386: Fix some problems in stv cost model

2025-05-10 Thread H.J. Lu
On Sun, May 11, 2025 at 4:28 AM Jan Hubicka wrote: > > Hi, > this patch fixes some of problems with cosint in scalar to vector pass. > In particular This caused: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120215 > 1) the pass uses optimize_insn_for_size which is intended to be used by >

[PATCH] x86: Change dest to src in replace_vector_const

2025-05-10 Thread H.J. Lu
:00 2001 From: "H.J. Lu" Date: Sun, 11 May 2025 06:17:45 +0800 Subject: [PATCH] x86: Change dest to src in replace_vector_const Replace rtx dest = SET_SRC (set); with rtx src = SET_SRC (set); in replace_vector_const to avoid confusion. * config/i386/i386-f

[PATCH v2 2/3] x86: Add a pass to fold tail call

2025-05-07 Thread H.J. Lu
ewise. * gcc.target/i386/pr47253-7b.c: Likewise. * gcc.target/i386/pr47253-8.c: Likewise. Signed-off-by: H.J. Lu --- gcc/config/i386/i386-features.cc | 237 + gcc/config/i386/i386-passes.def| 1 + gcc/config/i386/i386-protos.h |

[PATCH v2 3/3] x86: Fold sibcall targets into jump table

2025-05-07 Thread H.J. Lu
ise. * gcc.target/i386/pr14721-2b.c: Likewise. * gcc.target/i386/pr14721-2c.c: Likewise. * gcc.target/i386/pr14721-3c.c: Likewise. * gcc.target/i386/pr14721-3b.c: Likewise. * gcc.target/i386/pr14721-3c.c: Likewise. Signed-off-by: H.J. Lu --- gc

[PATCH v2 1/3] Support symbol reference in jump label and jump table

2025-05-07 Thread H.J. Lu
6_notrack_prefixed_insn_p): Likewise. * doc/rtl.texi (addr_vec): Also allow symbol reference. (JUMP_LABEL): Likewise. Signed-off-by: H.J. Lu --- gcc/config/i386/i386-expand.cc | 5 - gcc/doc/rtl.texi | 24 +-- g

[PATCH v2 0/3] x86: Add a pass to fold tail call

2025-05-07 Thread H.J. Lu
target basic block with only a direct sibcall, change the entry to point to the sibcall target, decrement the target basic block entry label use count and redirect the edge to the exit basic block. H.J. Lu (3): Support symbol reference in jump label and jump table x86: Add a pass to fold t

PING: [PATCH] Add TARGET_STORE_BY_PIECES_ICODE

2025-05-07 Thread H.J. Lu
On Mon, Apr 28, 2025 at 8:57 PM H.J. Lu wrote: > > On x86, both stores with 32-bit immediate and register are supported: > >0: 48 c7 40 10 00 00 00 00 movq $0x0,0x10(%rax) >8: 48 89 50 10 movq %rdx,0x10(%rax) > > But store with 32-bit immediate is 4

[PATCH v3] x86: Insert extra move for mode size smaller than natural size

2025-05-06 Thread H.J. Lu
On Wed, May 7, 2025 at 9:25 AM H.J. Lu wrote: > > On Wed, May 7, 2025 at 9:17 AM Hongtao Liu wrote: > > > > On Wed, May 7, 2025 at 9:06 AM H.J. Lu wrote: > > > > > > On Tue, May 6, 2025 at 3:35 PM Hongtao Liu wrote: > > > > >

Re: [PATCH v2] x86: Insert extra move for mode size smaller than natural size

2025-05-06 Thread H.J. Lu
On Wed, May 7, 2025 at 9:17 AM Hongtao Liu wrote: > > On Wed, May 7, 2025 at 9:06 AM H.J. Lu wrote: > > > > On Tue, May 6, 2025 at 3:35 PM Hongtao Liu wrote: > > > > > > On Tue, May 6, 2025 at 3:06 PM H.J. Lu wrote: > > > > > > > >

[PATCH v2] x86: Insert extra move for mode size smaller than natural size

2025-05-06 Thread H.J. Lu
On Tue, May 6, 2025 at 3:35 PM Hongtao Liu wrote: > > On Tue, May 6, 2025 at 3:06 PM H.J. Lu wrote: > > > > On Tue, May 6, 2025 at 2:30 PM Liu, Hongtao wrote: > > > > > > > > > > > > > -Original Message- > > > > From

Re: [PATCH 2/3] x86: Add a pass to fold tail call

2025-05-06 Thread H.J. Lu
On Wed, May 7, 2025 at 12:29 AM Andi Kleen wrote: > > On 2025-05-06 09:48, H.J. Lu wrote: > > On Mon, May 5, 2025 at 9:56 PM Andi Kleen wrote: > >> > >> On Mon, May 05, 2025 at 06:20:40AM -0700, Andi Kleen wrote: > >> > > If the branch edge de

Re: [PATCH 2/3] x86: Add a pass to fold tail call

2025-05-06 Thread H.J. Lu
On Mon, May 5, 2025 at 9:56 PM Andi Kleen wrote: > > On Mon, May 05, 2025 at 06:20:40AM -0700, Andi Kleen wrote: > > > If the branch edge destination is a basic block with only a direct > > > sibcall, change the jcc target to the sibcall target, decrement the > > > destination basic block entry la

Re: [PATCH 2/3] x86: Add a pass to fold tail call

2025-05-06 Thread H.J. Lu
On Mon, May 5, 2025 at 9:20 PM Andi Kleen wrote: > > > If the branch edge destination is a basic block with only a direct > > sibcall, change the jcc target to the sibcall target, decrement the > > destination basic block entry label use count and redirect the edge > > to the exit basic block. Ca

Re: [PATCH] x86: Skip if the mode size is smaller than its natural size

2025-05-06 Thread H.J. Lu
On Tue, May 6, 2025 at 2:30 PM Liu, Hongtao wrote: > > > > > -Original Message----- > > From: H.J. Lu > > Sent: Tuesday, May 6, 2025 2:16 PM > > To: Liu, Hongtao > > Cc: GCC Patches ; Uros Bizjak > > > > Subject: Re: [PATCH] x86: Skip if

Re: [PATCH] x86: Skip if the mode size is smaller than its natural size

2025-05-05 Thread H.J. Lu
On Tue, May 6, 2025 at 10:54 AM Liu, Hongtao wrote: > > > > > -Original Message----- > > From: H.J. Lu > > Sent: Thursday, May 1, 2025 6:39 AM > > To: GCC Patches ; Uros Bizjak > > ; Liu, Hongtao > > Subject: [PATCH] x86: Skip if the mode size

[PATCH 3/3] x86: Fold sibcall targets into jump table

2025-05-04 Thread H.J. Lu
ise. * gcc.target/i386/pr14721-2b.c: Likewise. * gcc.target/i386/pr14721-2c.c: Likewise. * gcc.target/i386/pr14721-3c.c: Likewise. * gcc.target/i386/pr14721-3b.c: Likewise. * gcc.target/i386/pr14721-3c.c: Likewise. Signed-off-by: H.J. Lu --- gc

[PATCH 0/3] x86: Add a pass to fold tail call

2025-05-04 Thread H.J. Lu
c block. If the jump table entry points to a target basic block with only a direct sibcall, change the entry to point to the sibcall target, decrement the target basic block entry label use count and redirect the edge to the exit basic block. Call delete_unreachable_blocks to delete the unreachable basi

[PATCH 2/3] x86: Add a pass to fold tail call

2025-05-04 Thread H.J. Lu
53-5.c: Likewise. * gcc.target/i386/pr47253-6.c: Likewise. * gcc.target/i386/pr47253-7a.c: Likewise. * gcc.target/i386/pr47253-7b.c: Likewise. * gcc.target/i386/pr47253-8.c: Likewise. Signed-off-by: H.J. Lu --- gcc/config/i386/i386-features.cc | 204

[PATCH 1/3] Support symbol reference in jump label and jump table

2025-05-04 Thread H.J. Lu
x argument. * rtl.h (condsibcall_p): New. * rtlanal.cc (tablejump_p): Return false if JUMP_LABEL is a symbol reference. * config/i386/i386-expand.cc (ix86_notrack_prefixed_insn_p): Likewise. * doc/rtl.texi (addr_vec): Also allow

[PATCH v2] Use incoming small integer argument value as if promoted

2025-05-04 Thread H.J. Lu
On Wed, Apr 30, 2025 at 2:43 PM Richard Biener wrote: > > On Tue, Apr 29, 2025 at 3:53 PM H.J. Lu wrote: > > > > On Tue, Apr 29, 2025 at 9:34 PM Richard Biener > > wrote: > > > > > > On Tue, Apr 29, 2025 at 2:33 PM H.J. Lu wrote: > > > >

Re: [PATCH] x86-64: Don't expand UNSPEC_TLS_LD_BASE to a call

2025-05-01 Thread H.J. Lu
On Wed, Apr 30, 2025 at 7:40 PM Uros Bizjak wrote: > > On Tue, Apr 29, 2025 at 12:22 PM H.J. Lu wrote: > > > > On Tue, Apr 29, 2025 at 5:30 PM Uros Bizjak wrote: > > > > > > On Tue, Apr 29, 2025 at 9:56 AM H.J. Lu wrote: > > > > > > > &g

Re: [PATCH] vect: Use internal storage for converts for call into supportable_indirect_convert_operation [PR118617]

2025-05-01 Thread H.J. Lu
On Fri, May 2, 2025 at 7:28 AM Andrew Pinski wrote: > > While looking into PR 118616, I noticed that > supportable_indirect_convert_operation only pushes up to 2 into its vec. > And the 2 places which call supportable_indirect_convert_operation, > use an auto_vec but without an internal storage. I

Re: [PATCH] x86: Remove BREG from ix86_class_likely_spilled_p

2025-05-01 Thread H.J. Lu
On Thu, May 1, 2025 at 2:56 PM Uros Bizjak wrote: > > On Wed, Apr 30, 2025 at 11:31 PM H.J. Lu wrote: > > > > On Wed, Apr 30, 2025 at 7:37 PM Uros Bizjak wrote: > > > > > > On Tue, Apr 29, 2025 at 11:40 PM H.J. Lu wrote: > > > >

Re: [PATCH] x86: Update TARGET_SMALL_REGISTER_CLASSES_FOR_MODE_P

2025-04-30 Thread H.J. Lu
On Wed, Apr 30, 2025 at 7:48 PM Uros Bizjak wrote: > > On Tue, Apr 29, 2025 at 11:40 PM H.J. Lu wrote: > > > > SMALL_REGISTER_CLASSES was added by > > > > commit c98f874233428d7e6ba83def7842fd703ac0ddf1 > > Author: James Van Artsdalen >

[PATCH] x86: Skip if the mode size is smaller than its natural size

2025-04-30 Thread H.J. Lu
(remove_redundant_vector_load): Also skip if the mode size is smaller than its natural size. gcc/testsuite/ PR target/120036 * g++.target/i386/pr120036.C: New test. -- H.J. From 6bfacf6014965d3ec498620dd9951efca9ad6015 Mon Sep 17 00:00:00 2001 From: "H.J. Lu" Date: Thu, 1 May 2025 06:3

Re: [PATCH] x86: Remove SSE_FIRST_REG from ix86_class_likely_spilled_p

2025-04-30 Thread H.J. Lu
On Wed, Apr 30, 2025 at 8:12 PM Uros Bizjak wrote: > > On Tue, Apr 29, 2025 at 11:40 PM H.J. Lu wrote: > > > > SSE_FIRST_REG was added to CLASS_LIKELY_SPILLED_P, which became > > TARGET_CLASS_LIKELY_SPILLED_P, for > > > > https://gcc.gnu.org/bugzilla/show_b

Re: [PATCH] x86: Remove BREG from ix86_class_likely_spilled_p

2025-04-30 Thread H.J. Lu
On Wed, Apr 30, 2025 at 7:37 PM Uros Bizjak wrote: > > On Tue, Apr 29, 2025 at 11:40 PM H.J. Lu wrote: > > > > AREG, DREG, CREG and AD_REGS are kept in ix86_class_likely_spilled_p to > > avoid the following regressions with > > > > $ make check RUN

[PATCH] x86: Remove BREG from ix86_class_likely_spilled_p

2025-04-29 Thread H.J. Lu
l: move pthread_once into libc and built Linux kernel 6.13.5 on x86-64. PR target/119083 * config/i386/i386.cc (ix86_class_likely_spilled_p): Remove CREG and BREG. Signed-off-by: H.J. Lu --- gcc/config/i386/i386.cc | 1 - 1 file changed, 1 deletion(-) diff --git a/gcc/c

[PATCH] x86: Remove SSE_FIRST_REG from ix86_class_likely_spilled_p

2025-04-29 Thread H.J. Lu
f vpermi2w. * gcc.target/i386/avx512fp16-builtin_shuffle-1.c: Likewise. * gcc.target/i386/vpermt2-special-bf16-shufflue.c: Likewise. * gcc.target/i386/pr101846-4.c: Scan vpermt2b instead of vpermi2b. Signed-off-by: H.J. Lu --- gcc/config/i386/i386

[PATCH] x86: Update TARGET_SMALL_REGISTER_CLASSES_FOR_MODE_P

2025-04-29 Thread H.J. Lu
* ira.cc (decrease_live_ranges_number): Skip hard register if targetm.class_likely_spilled_p returns true. * config/i386/i386.cc (ix86_small_register_classes_for_mode_p): New. (TARGET_SMALL_REGISTER_CLASSES_FOR_MODE_P): Use it. Signed-off-by: H.J. Lu --- gcc

Re: [PATCH] Use incoming small integer argument value if possible

2025-04-29 Thread H.J. Lu
On Tue, Apr 29, 2025 at 9:34 PM Richard Biener wrote: > > On Tue, Apr 29, 2025 at 2:33 PM H.J. Lu wrote: > > > > On Tue, Apr 29, 2025 at 6:46 PM Richard Biener > > wrote: > > > > > > On Tue, Apr 29, 2025 at 12:32 PM H.J. Lu wrote: > > > >

Re: [pushed] i386: Allow string instructions from non-default address space [PR111657]

2025-04-29 Thread H.J. Lu
On Tue, Apr 29, 2025 at 6:49 PM Uros Bizjak wrote: > > On Tue, Apr 29, 2025 at 12:41 PM H.J. Lu wrote: > > > > On Tue, Apr 29, 2025 at 5:52 PM Uros Bizjak wrote: > > > > > > MOVS instructions allow segment override of their source operand, e.g.: > &g

Re: [PATCH] Use incoming small integer argument value if possible

2025-04-29 Thread H.J. Lu
On Tue, Apr 29, 2025 at 6:46 PM Richard Biener wrote: > > On Tue, Apr 29, 2025 at 12:32 PM H.J. Lu wrote: > > > > On Tue, Apr 29, 2025 at 5:56 PM Richard Biener > > wrote: > > > > > > On Tue, Apr 29, 2025 at 10:48 AM H.J. Lu wrote: > > > >

Re: [pushed] i386: Allow string instructions from non-default address space [PR111657]

2025-04-29 Thread H.J. Lu
On Tue, Apr 29, 2025 at 5:52 PM Uros Bizjak wrote: > > MOVS instructions allow segment override of their source operand, e.g.: > > rep movsq %gs:(%rsi), (%rdi) > > where %rsi is the address of the source location (with %gs segment override) > and %rdi is the address of the destination location

Re: [PATCH] Use incoming small integer argument value if possible

2025-04-29 Thread H.J. Lu
On Tue, Apr 29, 2025 at 5:56 PM Richard Biener wrote: > > On Tue, Apr 29, 2025 at 10:48 AM H.J. Lu wrote: > > > > On Tue, Apr 29, 2025 at 4:25 PM Richard Biener > > wrote: > > > > > > On Tue, Apr 29, 2025 at 9:39 AM H.J. Lu wrote: > >

Re: [PATCH] x86-64: Don't expand UNSPEC_TLS_LD_BASE to a call

2025-04-29 Thread H.J. Lu
On Tue, Apr 29, 2025 at 5:30 PM Uros Bizjak wrote: > > On Tue, Apr 29, 2025 at 9:56 AM H.J. Lu wrote: > > > > Don't expand UNSPEC_TLS_LD_BASE to a call so that the RTL local copy > > propagation pass can eliminate multiple __tls_get_addr calls. > > __tls_get_a

Re: [PATCH] Use incoming small integer argument value if possible

2025-04-29 Thread H.J. Lu
On Tue, Apr 29, 2025 at 4:25 PM Richard Biener wrote: > > On Tue, Apr 29, 2025 at 9:39 AM H.J. Lu wrote: > > > > For targets, like x86, which define TARGET_PROMOTE_PROTOTYPES to return > > true, all integer arguments smaller than int are passed as int: > > > &

[PATCH] x86-64: Don't expand UNSPEC_TLS_LD_BASE to a call

2025-04-29 Thread H.J. Lu
to unspec. (*tls_local_dynamic_base_64_): New. gcc/testsuite/ PR target/81501 * gcc.target/i386/pr81501-1.c: New test. OK for master? Thanks. -- H.J. From d154b3bf2fb86c82a6291f1fae45fbbe0d74f4e4 Mon Sep 17 00:00:00 2001 From: "H.J. Lu" Date: Fri, 19 Aug 2022 11:50:41 -0700 Subject: [PATCH]

[PATCH] Use incoming small integer argument value if possible

2025-04-29 Thread H.J. Lu
pr14907-21.c: Likewise. * gcc.target/i386/pr14907-22.c: Likewise. -- H.J. From a093f7cff03796cf7e72e850a722d079db3c72af Mon Sep 17 00:00:00 2001 From: "H.J. Lu" Date: Thu, 21 Nov 2024 09:22:40 +0800 Subject: [PATCH] Use incoming small integer argument value if possible For targets,

[PATCH v3] x86: Add a pass to remove redundant all 0s/1s vector load

2025-04-29 Thread H.J. Lu
On Tue, Apr 29, 2025 at 11:27 AM H.J. Lu wrote: > > On Tue, Apr 29, 2025 at 10:08 AM Hongtao Liu wrote: > > > > On Mon, Apr 28, 2025 at 5:07 PM H.J. Lu wrote: > > > > > > On Mon, Apr 28, 2025 at 4:26 PM H.J. Lu wrote: > > > > > > >

Re: [PATCH] i386: Add ix86_expand_unsigned_small_int_cst_argument

2025-04-29 Thread H.J. Lu
On Tue, Apr 29, 2025 at 2:51 PM Liu, Hongtao wrote: > > > > > -Original Message----- > > From: H.J. Lu > > Sent: Tuesday, April 29, 2025 1:58 PM > > To: Hongtao Liu > > Cc: GCC Patches ; Uros Bizjak > > ; Liu

Re: [PATCH] i386: Add ix86_expand_unsigned_small_int_cst_argument

2025-04-28 Thread H.J. Lu
On Tue, Apr 29, 2025 at 1:54 PM H.J. Lu wrote: > > On Tue, Apr 29, 2025 at 12:56 PM Hongtao Liu wrote: > > > > On Sun, Apr 27, 2025 at 10:58 AM H.J. Lu wrote: > > > > > > When passing 0xff as an unsigned char function argument with the C > > > fr

Re: [PATCH] i386: Add ix86_expand_unsigned_small_int_cst_argument

2025-04-28 Thread H.J. Lu
On Tue, Apr 29, 2025 at 12:56 PM Hongtao Liu wrote: > > On Sun, Apr 27, 2025 at 10:58 AM H.J. Lu wrote: > > > > When passing 0xff as an unsigned char function argument with the C frontend > > promotion, expand_normal used to get > > > > constant > &g

PING: [PATCH] x86: Add preserve_none and update no_caller_saved_registers attributes

2025-04-28 Thread H.J. Lu
On Fri, Apr 18, 2025 at 7:10 PM H.J. Lu wrote: > > Add preserve_none attribute which is similar to no_callee_saved_registers > attribute, except on x86-64, r12, r13, r14, r15, rdi and rsi registers are > used for integer parameter passing. This can be used in an interpreter >

Re: [PATCH v2] x86: Add a pass to remove redundant all 0s/1s vector load

2025-04-28 Thread H.J. Lu
On Tue, Apr 29, 2025 at 10:08 AM Hongtao Liu wrote: > > On Mon, Apr 28, 2025 at 5:07 PM H.J. Lu wrote: > > > > On Mon, Apr 28, 2025 at 4:26 PM H.J. Lu wrote: > > > > > > > > > > This is what my patch does: > > > > But it iterates throu

[PATCH] target.def: Remove TARGET_PROMOTE_FUNCTION_RETURN reference

2025-04-28 Thread H.J. Lu
:00:00 2001 From: "H.J. Lu" Date: Tue, 29 Apr 2025 09:44:29 +0800 Subject: [PATCH] target.def: Remove TARGET_PROMOTE_FUNCTION_RETURN reference Since TARGET_PROMOTE_FUNCTION_RETURN is no longer used, remove its reference from target.def. PR target/119985 * target.d

Re: [PATCH][stage1] tree-optimization/119103 - missed overwidening detection for shift

2025-04-28 Thread H.J. Lu
On Mon, Apr 28, 2025 at 9:09 PM Richard Biener wrote: > > On Wed, Mar 5, 2025 at 12:50 PM Richard Biener wrote: > > > > On Tue, 4 Mar 2025, Richard Sandiford wrote: > > > > > Richard Biener writes: > > > > When vectorizing a shift of u16 data by an amount that's known to > > > > be less than 16

[PATCH] Add TARGET_STORE_BY_PIECES_ICODE

2025-04-28 Thread H.J. Lu
85b3f9e34389cba742ca11cacc88b98877be2151 Mon Sep 17 00:00:00 2001 From: "H.J. Lu" Date: Mon, 21 Apr 2025 21:12:35 +0800 Subject: [PATCH] Add TARGET_STORE_BY_PIECES_ICODE On x86, both stores with 32-bit immediate and register are supported: 0: 48 c7 40 10 00 00 00 00 movq $0x0,0x10(%rax) 8: 4

[PATCH v2] x86: Add a pass to remove redundant all 0s/1s vector load

2025-04-28 Thread H.J. Lu
On Mon, Apr 28, 2025 at 4:26 PM H.J. Lu wrote: > > > > This is what my patch does: > > But it iterates through vector_insns, using a def-ref chain to find > > those insns. I think we can just record those single_set with src as > > const_m1/zero, and replace src for

Re: PING: [PATCH] x86: Add a pass to remove redundant all 0s/1s vector load

2025-04-28 Thread H.J. Lu
On Tue, Apr 22, 2025 at 10:01 AM Hongtao Liu wrote: > > On Mon, Apr 21, 2025 at 4:30 PM H.J. Lu wrote: > > > > On Mon, Apr 21, 2025 at 11:29 AM Hongtao Liu wrote: > > > > > > On Sat, Apr 19, 2025 at 1:25 PM H.J. Lu wrote: > > > > >

[PATCH v3] x86: Properly find the maximum stack slot alignment

2025-04-27 Thread H.J. Lu
R target/109093 * g++.target/i386/pr109780-1.C: New test. * gcc.target/i386/pr109093-1.c: Likewise. * gcc.target/i386/pr109780-1.c: Likewise. * gcc.target/i386/pr109780-2.c: Likewise. * gcc.target/i386/pr109780-3.c: Likewise. -- H.J. From 2233834e398711b65c8b8eeefbf6fa830a6c2974 Mon Sep 17 00:00:00

[PATCH] i386: Add ix86_expand_unsigned_small_int_cst_argument

2025-04-26 Thread H.J. Lu
When passing 0xff as an unsigned char function argument with the C frontend promotion, expand_normal used to get constant 255> and returned the rtx value using the sign-extended representation: (const_int 255 [0xff]) But after commit a670ebde3995481225ec62b29686ec07a21e5c10 Author: H.J.

RFC v2: Add TARGET_STORE_BY_PIECES_ICODE

2025-04-21 Thread H.J. Lu
On Mon, Apr 21, 2025 at 9:38 PM H.J. Lu wrote: > > On Mon, Apr 21, 2025 at 6:34 PM Jan Hubicka wrote: > ... > > We originally put CLEAR_RATIO < MOVE_RATIO based on observation that > > mov $0, mem > > is longer in encoding than > > mov mem, mem &

RFC: Add TARGET_STORE_BY_PIECES_ICODE

2025-04-21 Thread H.J. Lu
moves, but it did not materialize (yet). With SSE this problem > disappears since SSE stores does not have immediates anyway. Here is a patch to implement it with UNSPEC_STORE_BY_PIECES. How does it look? -- H.J. From c021053a4fea121a3c4a593b2907701c42a626bc Mon Sep 17 00:00:00 2001 From: "H.J. L

Re: [PATCH v2] x86: Update memcpy/memset inline strategies for -mtune=generic

2025-04-21 Thread H.J. Lu
On Mon, Apr 21, 2025 at 6:34 PM Jan Hubicka wrote: > > > On Mon, Apr 21, 2025 at 7:24 AM H.J. Lu wrote: > > > > > > On Sun, Apr 20, 2025 at 6:31 PM Jan Hubicka wrote: > > > > > > > > > PR target/102294 > > > > >

[PATCH v2] x86: Properly find the maximum stack slot alignment

2025-04-21 Thread H.J. Lu
On Mon, Apr 21, 2025 at 3:06 PM Uros Bizjak wrote: > > On Sun, Apr 20, 2025 at 11:26 PM H.J. Lu wrote: > > > > Don't assume that stack slots can only be accessed by stack or frame > > registers. We first find all registers defined by stack or frame > > regist

Re: PING: [PATCH] x86: Add a pass to remove redundant all 0s/1s vector load

2025-04-21 Thread H.J. Lu
On Mon, Apr 21, 2025 at 11:29 AM Hongtao Liu wrote: > > On Sat, Apr 19, 2025 at 1:25 PM H.J. Lu wrote: > > > > On Sun, Dec 1, 2024 at 7:50 AM H.J. Lu wrote: > > > > > > For all different modes of all 0s/1s vectors, we can use the single widest > >

Re: [PATCH v2] x86: Update memcpy/memset inline strategies for -mtune=generic

2025-04-20 Thread H.J. Lu
On Mon, Apr 21, 2025 at 7:24 AM H.J. Lu wrote: > > On Sun, Apr 20, 2025 at 6:31 PM Jan Hubicka wrote: > > > > > PR target/102294 > > > PR target/119596 > > > * config/i386/x86-tune-costs.h (generic_memcpy): Updated.

Re: [PATCH v2] x86: Update memcpy/memset inline strategies for -mtune=generic

2025-04-20 Thread H.J. Lu
On Sun, Apr 20, 2025 at 6:31 PM Jan Hubicka wrote: > > > PR target/102294 > > PR target/119596 > > * config/i386/x86-tune-costs.h (generic_memcpy): Updated. > > (generic_memset): Likewise. > > (generic_cost): Change CLEAR_RATIO to 17. > > * config/i386/x86-tune.

[PATCH] x86: Properly find the maximum stack slot alignment

2025-04-20 Thread H.J. Lu
ewise. * gcc.target/i386/pr109780-3.c: Likewise. Signed-off-by: H.J. Lu --- gcc/config/i386/i386.cc| 174 ++--- gcc/testsuite/g++.target/i386/pr109780-1.C | 72 + gcc/testsuite/gcc.target/i386/pr109093-1.c | 33 gcc/testsuite/gcc.target/i386/pr109780-1.c

Re: PING: [PATCH v2] x86: Add pcmpeq splitters

2025-04-20 Thread H.J. Lu
On Sat, Apr 19, 2025 at 4:16 PM Uros Bizjak wrote: > > On Sat, Apr 19, 2025 at 7:22 AM H.J. Lu wrote: > > > > On Mon, Dec 2, 2024 at 6:27 AM H.J. Lu wrote: > > > > > > Add pcmpeq splitters to split > > > > > > (insn 5 3 7 2 (

Re: [PATCH] simplify-rtx: Fix shortcut for vector eq/ne

2025-04-20 Thread H.J. Lu
eop1) > + || CONST_VECTOR_P (trueop1))) >&& (tem = simplify_binary_operation (MINUS, mode, op0, op1)) != 0 >/* We cannot do this if tem is a nonzero address. */ >&& ! nonzero_address_p (tem)) > return simplify_const_relatio

Re: [PATCH v2] x86: Update memcpy/memset inline strategies for -mtune=generic

2025-04-19 Thread H.J. Lu
On Sun, Apr 20, 2025 at 4:19 AM Jan Hubicka wrote: > > > On Tue, Apr 8, 2025 at 3:52 AM H.J. Lu wrote: > > > > > > Simplify memcpy and memset inline strategies to avoid branches for > > > -mtune=generic: > > > > > > 1. With MOVE_RA

PING: [PATCH] x86: Add a pass to remove redundant all 0s/1s vector load

2025-04-18 Thread H.J. Lu
On Sun, Dec 1, 2024 at 7:50 AM H.J. Lu wrote: > > For all different modes of all 0s/1s vectors, we can use the single widest > all 0s/1s vector register for all 0s/1s vector uses in the whole function. > Add a pass to generate a single widest all 0s/1s vector set instruction at &g

PING: [PATCH v2] x86: Add pcmpeq splitters

2025-04-18 Thread H.J. Lu
On Mon, Dec 2, 2024 at 6:27 AM H.J. Lu wrote: > > Add pcmpeq splitters to split > > (insn 5 3 7 2 (set (reg:V4SI 100) > (eq:V4SI (reg:V4SI 98) > (reg:V4SI 98))) 7910 {*sse2_eqv4si3} > (expr_list:REG_DEAD (reg:V4SI 98) > (expr

[PATCH] x86: Add preserve_none and update no_caller_saved_registers attributes

2025-04-18 Thread H.J. Lu
-29.c: Likewise. * gcc.target/i386/preserve-none-30a.c: Likewise. * gcc.target/i386/preserve-none-30b.c: Likewise. Signed-off-by: H.J. Lu --- gcc/config/i386/i386-expand.cc| 6 +- gcc/config/i386/i386-options.cc | 90 -- gcc/config/i386/i386-protos.h

Re: [PATCH] x86: Update gcc.target/i386/apx-interrupt-1.c

2025-04-16 Thread H.J. Lu
On Tue, Apr 15, 2025 at 12:19 PM Uros Bizjak wrote: > > On Tue, Apr 15, 2025 at 2:23 PM H.J. Lu wrote: > > > > On Tue, Apr 15, 2025 at 12:45 AM Uros Bizjak wrote: > > > > > > On Tue, Apr 15, 2025 at 1:06 AM H.J. Lu wrote: > > > > > > >

[PATCH][GCC14] Extend check-function-bodies to allow label and directives

2025-04-15 Thread H.J. Lu
tarts with ".L". Signed-off-by: H.J. Lu (cherry picked from commit d6bb1e257fc414d21bc31faa7ddecbc93a197e3c) --- gcc/doc/sourcebuild.texi | 9 ++--- gcc/testsuite/gcc.target/i386/pr116174.c | 18 +++--- gcc/testsuite/lib/scanasm.exp| 15 +++

[PATCH] x86: Update gcc.target/i386/apx-interrupt-1.c

2025-04-15 Thread H.J. Lu
ix86_add_cfa_restore_note omits the REG_CFA_RESTORE REG note for registers pushed in red-zone. Since commit 0a074b8c7e79f9d9359d044f1499b0a9ce9d2801 Author: H.J. Lu Date: Sun Apr 13 12:20:42 2025 -0700 APX: Don't use red-zone with 32 GPRs and no caller-saved registers disabled red

Re: [PATCH] x86: Update gcc.target/i386/apx-interrupt-1.c

2025-04-15 Thread H.J. Lu
On Tue, Apr 15, 2025 at 12:45 AM Uros Bizjak wrote: > > On Tue, Apr 15, 2025 at 1:06 AM H.J. Lu wrote: > > > > ix86_add_cfa_restore_note omits the REG_CFA_RESTORE REG note for registers > > pushed in red-zone. Since > > > > commit 0a074b8c7e79f9d9359d044f

Re: [PATCH] APX: Don't use red-zone with APX and no caller-saved registers

2025-04-14 Thread H.J. Lu
On Mon, Apr 14, 2025 at 2:39 AM Uros Bizjak wrote: > > On Mon, Apr 14, 2025 at 8:54 AM Hongtao Liu wrote: > > > > On Mon, Apr 14, 2025 at 7:36 AM H.J. Lu wrote: > > > > > > Don't use red-zone when there are no caller-saved registers and APX is > >

[PATCH] APX: Don't use red-zone with APX and no caller-saved registers

2025-04-13 Thread H.J. Lu
/testsuite/ PR target/119784 * gcc.target/i386/pr119784a.c: New test. * gcc.target/i386/pr119784b.c: Likewise. Signed-off-by: H.J. Lu --- gcc/config/i386/i386.cc | 6 ++ gcc/testsuite/gcc.target/i386/pr119784a.c | 96 +++ gcc/testsuite/

Re: [PATCH v4 1/2] i386: Prefer PLT indirection for __fentry__ calls under -fPIC

2025-04-09 Thread H.J. Lu
On Wed, Apr 9, 2025 at 1:53 AM Ard Biesheuvel wrote: > > From: Ard Biesheuvel > > Commit bde21de1205 ("i386: Honour -mdirect-extern-access when calling > __fentry__") updated the logic that emits mcount() / __fentry__() calls > into function prologues when profiling is enabled, to avoid GOT-based

Re: [PATCH v4 1/2] i386: Prefer PLT indirection for __fentry__ calls under -fPIC

2025-04-09 Thread H.J. Lu
On Wed, Apr 9, 2025 at 8:54 AM Ard Biesheuvel wrote: > > On Wed, 9 Apr 2025 at 16:46, H.J. Lu wrote: > > > > On Wed, Apr 9, 2025 at 1:53 AM Ard Biesheuvel wrote: > > > > > > From: Ard Biesheuvel > > > > > > Commit bde21de1205 ("i386:

Re: [PATCH v3] i386: Prefer PLT indirection for __fentry__ calls under -fPIC

2025-04-08 Thread H.J. Lu
On Tue, Apr 8, 2025 at 9:59 AM Ard Biesheuvel wrote: > > On Tue, 8 Apr 2025 at 18:44, H.J. Lu wrote: > > > > On Tue, Apr 8, 2025 at 9:39 AM Ard Biesheuvel wrote: > > > > > > On Tue, 8 Apr 2025 at 15:33, H.J. Lu wrote: > > > > >

Re: [PATCH v3] i386: Prefer PLT indirection for __fentry__ calls under -fPIC

2025-04-08 Thread H.J. Lu
On Tue, Apr 8, 2025 at 9:39 AM Ard Biesheuvel wrote: > > On Tue, 8 Apr 2025 at 15:33, H.J. Lu wrote: > > > > On Tue, Apr 8, 2025 at 3:46 AM Ard Biesheuvel wrote: > > > > > > From: Ard Biesheuvel > > > > > > Commit bde21de1205 ("i386:

Re: [PATCH v3] i386: Prefer PLT indirection for __fentry__ calls under -fPIC

2025-04-08 Thread H.J. Lu
On Tue, Apr 8, 2025 at 3:46 AM Ard Biesheuvel wrote: > > From: Ard Biesheuvel > > Commit bde21de1205 ("i386: Honour -mdirect-extern-access when calling > __fentry__") updated the logic that emits mcount() / __fentry__() calls > into function prologues when profiling is enabled, to avoid GOT-based

Re: [PATCH v2] i386: Prefer PLT indirection for __fentry__ calls under -fPIC

2025-04-08 Thread H.J. Lu
On Tue, Apr 8, 2025 at 3:15 AM Ard Biesheuvel wrote: > > From: Ard Biesheuvel > > Commit bde21de1205 ("i386: Honour -mdirect-extern-access when calling > __fentry__") updated the logic that emits mcount() / __fentry__() calls > into function prologues when profiling is enabled, to avoid GOT-based

[PATCH v2] x86: Update memcpy/memset inline strategies for -mtune=generic

2025-04-07 Thread H.J. Lu
get/i386/sw-1.c: Also pass -mstringop-strategy=rep_byte. Signed-off-by: H.J. Lu --- gcc/config/i386/x86-tune-costs.h | 31 --- gcc/config/i386/x86-tune.def | 2 +- .../gcc.target/i386/auto-init-padding-3.c | 7 ++--- .../gcc.target/i386/auto

Re: [PATCH] lto: lto-opts fixes [PR119625]

2025-04-07 Thread H.J. Lu
On Mon, Apr 7, 2025 at 2:53 AM Richard Biener wrote: > > On Fri, 4 Apr 2025, Jakub Jelinek wrote: > > > On Fri, Apr 04, 2025 at 08:52:10PM +0200, Richard Biener wrote: > > > > Or do you want something further (like > > > > switch (global_options.x_flag_cf_protection & ~CF_SET) > > > > )? > > > > >

Re: [Patch, fortran] PR119460 - gfortran.dg/reduce_1.f90 FAILs

2025-04-06 Thread H.J. Lu
On Sun, Apr 6, 2025 at 5:39 AM Paul Richard Thomas wrote: > > Hi All, > > As far as I can tell, the attached patch fixes the problems with the reduce > intrinsic. I would be grateful to the reporters if they would confirm that > this is the case. > > The key to the fix appears in reduce_3.f90, w

Re: [PATCH] i386: Prefer PLT indirection for __fentry__ calls under -fPIC

2025-04-06 Thread H.J. Lu
On Sun, Apr 6, 2025 at 8:54 AM H.J. Lu wrote: > > On Fri, Apr 4, 2025 at 12:01 AM Ard Biesheuvel wrote: > > > > From: Ard Biesheuvel > > > > Commit bde21de1205 ("i386: Honour -mdirect-extern-access when calling > > __fentry__") updated the logic tha

Re: [PATCH] i386: Prefer PLT indirection for __fentry__ calls under -fPIC

2025-04-06 Thread H.J. Lu
On Fri, Apr 4, 2025 at 12:01 AM Ard Biesheuvel wrote: > > From: Ard Biesheuvel > > Commit bde21de1205 ("i386: Honour -mdirect-extern-access when calling > __fentry__") updated the logic that emits mcount() / __fentry__() calls > into function prologues when profiling is enabled, to avoid GOT-base

Re: [Patch, fortran] PR85836: Implement the F2018 reduce intrinsic

2025-03-30 Thread H.J. Lu
On Wed, Mar 19, 2025 at 11:23 AM Paul Richard Thomas wrote: > > Hi Andre, > > Thanks for the review - I'll act on the points that you raised. > > The Linaro people reported a failure in reduce_1.f90 execution, which I > believe is due to incorrect casting of 'dim' and a wrong specification of its

[PATCH] gcc.dg/pr90838-2.c: Replace long with long long

2025-03-17 Thread H.J. Lu
Since gcc.dg/pr90838-2.c is only for 64-bit integer, replace long with long long for ILP32 targets. * gcc.dg/pr90838-2.c (ctz4): Replace long with long long. Signed-off-by: H.J. Lu --- gcc/testsuite/gcc.dg/pr90838-2.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff

Re: [Patch] Fortran: Store OpenMP's 'declare variant' in module file [PR115271]

2025-03-16 Thread H.J. Lu
On Sat, Mar 15, 2025 at 3:46 AM Tobias Burnus wrote: > > I wonder why sometimes my line breaks are preserved and at other times all > eaten. > > Next try ... > > Tobias Burnus wrote: > > Hi Thomas, > > Thomas Koenig wrote: > > Just one question - as this will change the module file, will we still

Re: [PATCH] middle-end/119204 - ICE with strcspn folding

2025-03-15 Thread H.J. Lu
On Tue, Mar 11, 2025 at 2:58 AM Richard Biener wrote: > > The following makes sure to convert the folded expression to the > original expression type. > > Bootstrapped and tested on x86_64-unknown-linux-gnu, OK? > > Thanks, > Richard. > > PR middle-end/119204 > * builtins.cc (fold_

[PATCH stage1 2/3] x86: Add a pass to fold tail call

2025-03-15 Thread H.J. Lu
53-5.c: Likewise. * gcc.target/i386/pr47253-6.c: Likewise. * gcc.target/i386/pr47253-7a.c: Likewise. * gcc.target/i386/pr47253-7b.c: Likewise. * gcc.target/i386/pr47253-8.c: Likewise. Signed-off-by: H.J. Lu --- gcc/config/i386/i386-features.cc | 204

[PATCH stage1 6/6] ssa-fre-4.c: Enable for all targets and adjust scan match

2025-03-14 Thread H.J. Lu
Since the C frontend no longer promotes char argument, enable ssa-fre-4.c for all targets and adjust scan match. PR middle-end/112877 * gcc.dg/tree-ssa/ssa-fre-4.c: Enable for all targets and adjust scan match. Signed-off-by: H.J. Lu --- gcc/testsuite/gcc.dg/tree-ssa

[PATCH stage1 2/6] Drop targetm.promote_prototypes from C, C++ and Ada frontends

2025-03-14 Thread H.J. Lu
/112877 * call.cc (type_passed_as): Remove the targetm.calls.promote_prototypes call. (convert_for_arg_passing): Likewise. * typeck.cc (cxx_safe_arg_type_equiv_p): Likewise. Signed-off-by: H.J. Lu --- gcc/ada/gcc-interface/utils.cc | 24 gcc/c/c

[PATCH stage1 1/6] Honor TARGET_PROMOTE_PROTOTYPES during RTL expand

2025-03-14 Thread H.J. Lu
/112877 * gfortran.dg/pr112877-1.f90: New test. Signed-off-by: H.J. Lu --- gcc/calls.cc | 9 + gcc/testsuite/gfortran.dg/pr112877-1.f90 | 17 + 2 files changed, 26 insertions(+) create mode 100644 gcc/testsuite/gfortran.dg/pr112877-1.f90
