Re: [PATCH v10 02/13] fmv: Refactor FMV name mangling.

2025-09-07 Thread Uros Bizjak
On Sun, Sep 7, 2025 at 11:24 PM Alfie Richards wrote: > > The 09/07/2025 12:41, Jeff Law wrote: > > > > > > On 8/28/25 3:49 AM, alfie.richa...@arm.com wrote: > > > From: Alfie Richards > > > > > > This patch is an overhaul of how FMV name mangling works. Previously > > > mangling logic was duplic

Re: [PATCH] x86: Allow by_pieces op when expanding memcpy/memset epilogue

2025-08-28 Thread Uros Bizjak
On Fri, Aug 29, 2025 at 2:55 AM H.J. Lu wrote: > > Since > > commit 401199377c50045ede560daf3f6e8b51749c2a87 > Author: H.J. Lu > Date: Tue Jun 17 10:17:17 2025 +0800 > > x86: Improve vector_loop/unrolled_loop for memset/memcpy > > uses move_by_pieces and store_by_pieces to expand memcpy/mem

Re: [PATCH] i386: wire up --with-tls to control -mtls-dialect= default

2025-08-23 Thread Uros Bizjak
On Sat, Aug 23, 2025 at 11:13 AM Uros Bizjak wrote: > > On Sat, Aug 23, 2025 at 2:42 AM Sam James wrote: > > > > Allow passing --with-tls= at configure-time to control the default value > > of -mtls-dialect= for i386 and x86_64. The default itself (gnu) is not > >

Re: [PATCH] i386: wire up --with-tls to control -mtls-dialect= default

2025-08-23 Thread Uros Bizjak
On Sat, Aug 23, 2025 at 2:42 AM Sam James wrote: > > Allow passing --with-tls= at configure-time to control the default value > of -mtls-dialect= for i386 and x86_64. The default itself (gnu) is not changed > unless --with-tls= is passed. > > --with-tls= is already wired up for ARM and RISC-V. > >

Re: [PATCH v2] x86: Add target("80387") function attribute

2025-08-14 Thread Uros Bizjak
On Fri, Aug 15, 2025 at 4:04 AM H.J. Lu wrote: > > Add target("80387") attribute to enable and disable x87 instructions in a > function. > > gcc/ > > PR target/121541 > * config/i386/i386-options.cc > (ix86_valid_target_attribute_inner_p): Add target("80387") > attr

Re: [PATCH] x86: Add target("80387") function attribute

2025-08-13 Thread Uros Bizjak
On Thu, Aug 14, 2025 at 6:58 AM H.J. Lu wrote: > > Add target("80387") attribute to enable and disable x87 instructions in a > function. > > gcc/ > > PR target/121541 > * config/i386/i386-options.cc > (ix86_valid_target_attribute_inner_p): Add a bool argument to > r

Re: [PATCH v2] x86: Disallow -mtls-dialect=gnu with no_caller_saved_registers

2025-08-13 Thread Uros Bizjak
On Wed, Aug 13, 2025 at 3:44 PM H.J. Lu wrote: > > On Mon, Jul 28, 2025 at 1:29 AM Uros Bizjak wrote: > > > > On Sat, Jul 26, 2025 at 7:37 PM H.J. Lu wrote: > > > > > > __tls_get_addr doesn't preserve vector registers. When a function > >

Re: Ping: [PATCH v2] testsuite: i386: Fix gcc.target/i386/pr90579.c when PIE is enabled [PR118885]

2025-08-10 Thread Uros Bizjak
On Sat, Aug 9, 2025 at 11:04 AM Xi Ruoyao wrote: > > Ping. > > On Wed, 2025-07-30 at 04:36 -0700, harish.sadin...@windriver.com wrote: > > From: Harish Sadineni > > > > When gcc is built with --enable-default-pie the following tests > > were failing: > > FAIL: gcc.target/i386/pr90579.c scan-

Re: [PATCH 1/5] asm-hard-reg-1.c: Adjust scan for x86 with ia32, x32 and lp64

2025-08-10 Thread Uros Bizjak
On Sun, Aug 10, 2025 at 12:02 AM H.J. Lu wrote: > > Since i?86 and x86_64 GCC can generate codes for ia32, x32 and lp64, adjust > asm-hard-reg-1.c scan for x86 with ia32, x32 and lp64. > > PR testsuite/121205 > * gcc.dg/asm-hard-reg-1.c: Adjust scan for x86 with ia32, x32 and >

Re: [PATCH] x86: Use sol2 linker emulation only for Solaris 2

2025-08-07 Thread Uros Bizjak
On Fri, Aug 8, 2025 at 6:26 AM H.J. Lu wrote: > > When GNU binutils is configured with --enable-targets=all on Linux, > "ld -V" will report both elf_x86_64_sol2 and elf_i386_sol2 as supported > emulations. But they should only be used for Solaris 2 targets. Check > for Solaris 2 targets before u

Re: [PATCH] x86: Add *one_cmplqi_ext_2

2025-08-06 Thread Uros Bizjak
On Tue, Aug 5, 2025 at 1:32 PM Richard Sandiford wrote: > > Richard Sandiford writes: > > "H.J. Lu" writes: > >> On Mon, Aug 4, 2025 at 3:28 PM H.J. Lu wrote: > >>> > >>> On Mon, Aug 4, 2025 at 2:04 PM H.J. Lu wrote: > >>> > > >>> > On Mon, Aug 4, 2025 at 8:50 AM Richard Sandiford > >>> > wro

Re: [PATCH v3] i386: Add missing PTA_POPCNT and PTA_LZCNT with PTA_ABM

2025-08-06 Thread Uros Bizjak
er today. Thanks, Uros. > > Thanks, > Yangyu Chen > > > On 6 Aug 2025, at 16:13, Uros Bizjak wrote: > > > > On Wed, Jul 30, 2025 at 7:24 PM Yangyu Chen wrote: > >> > >> This patch adds the missing PTA_POPCNT and PTA_LZCNT with the PTA_ABM > >

Re: [PATCH v3] i386: Add missing PTA_POPCNT and PTA_LZCNT with PTA_ABM

2025-08-06 Thread Uros Bizjak
On Wed, Jul 30, 2025 at 7:24 PM Yangyu Chen wrote: > > This patch adds the missing PTA_POPCNT and PTA_LZCNT with the PTA_ABM > bitmask definition for the bdver1, btver1, and lujiazui architectures > in the i386 architecture configuration file. > > Although these two features were not present in th

Re: [PATCH] x86: Get the widest vector mode from STORE_MAX_PIECES for memset

2025-08-05 Thread Uros Bizjak
On Tue, Aug 5, 2025 at 3:32 PM H.J. Lu wrote: > > commit 050b1708ea532ea4840e97d85fad4ca63d4cd631 > Author: H.J. Lu > Date: Thu Jun 19 05:03:48 2025 +0800 > > x86: Get the widest vector mode from MOVE_MAX > > gets the widest vector mode from MOVE_MAX. But for memset, it should > use STORE_

Re: [PATCH] i386: Extend recognition of high-reg rvalues [PR121306]

2025-08-05 Thread Uros Bizjak
On Tue, Aug 5, 2025 at 12:19 PM Richard Sandiford wrote: > > The i386 high-register patterns used things like: > > (match_operator:SWI248 2 "extract_operator" > [(match_operand 0 "int248_register_operand" "Q") >(const_int 8) >(const_int 8)]) > > to match an extraction of

Re: [PATCH] x86: Add *one_cmplqi_ext_2

2025-08-04 Thread Uros Bizjak
On Mon, Aug 4, 2025 at 11:05 PM H.J. Lu wrote: > > On Mon, Aug 4, 2025 at 8:50 AM Richard Sandiford > wrote: > > > > Uros Bizjak writes: > > > On Sat, Aug 2, 2025 at 8:56 PM H.J. Lu wrote: > > >> > > >> On Fri, Aug 1, 2025 at 10:32 PM Uro

Re: [PATCH] x86: Add *one_cmplqi_ext_2

2025-08-03 Thread Uros Bizjak
On Sat, Aug 2, 2025 at 8:56 PM H.J. Lu wrote: > > On Fri, Aug 1, 2025 at 10:32 PM Uros Bizjak wrote: > > > > On Sat, Aug 2, 2025 at 3:22 AM H.J. Lu wrote: > > > > > > After > > > > > > commit 965564eafb721f813a3112f1bba8d8fae32b >

Re: [PATCH] x86: Add *one_cmplqi_ext_2

2025-08-01 Thread Uros Bizjak
On Sat, Aug 2, 2025 at 3:22 AM H.J. Lu wrote: > > After > > commit 965564eafb721f813a3112f1bba8d8fae32b > Author: Richard Sandiford > Date: Tue Jul 29 15:58:34 2025 +0100 > > simplify-rtx: Simplify subregs of logic ops > > combine generates > > (set (zero_extract:SI (reg/v:SI 101 [ a ])

Re: [PATCH] Eliminate redundant vpextrq/vpinsrq when move TI to V4SI.

2025-07-30 Thread Uros Bizjak
On Wed, Jul 30, 2025 at 3:41 AM liuhongt wrote: > > r14-1902-g96c3539f2a3813 split TImode move with 2 DImode move, it's > supposed to optimize TImode in parameter/return since according to > psABI it's stored into 2 general registers. > > But when TImode is not in parameter/return, it could create

Re: [PATCH 1/4] i386: Ignore regparm attribute and warn for it in 64-bit mode

2025-07-29 Thread Uros Bizjak
On Tue, Jul 29, 2025 at 6:58 PM Uros Bizjak wrote: > > On Tue, Jul 29, 2025 at 5:04 PM wrote: > > > > On 2025-07-25 11:18, Uros Bizjak wrote: > > > On Thu, Jul 24, 2025 at 5:35 PM Artemiy Granat > > > wrote: > > >> > > >> gcc/testsu

Re: [PATCH 1/4] i386: Ignore regparm attribute and warn for it in 64-bit mode

2025-07-29 Thread Uros Bizjak
On Tue, Jul 29, 2025 at 5:04 PM wrote: > > On 2025-07-25 11:18, Uros Bizjak wrote: > > On Thu, Jul 24, 2025 at 5:35 PM Artemiy Granat > > wrote: > >> > >> gcc/testsuite/ChangeLog: > >> > >> * g++.dg/abi/regparm1.C: Use reg

Re: [PATCH v2] x86: Disallow -mtls-dialect=gnu with no_caller_saved_registers

2025-07-28 Thread Uros Bizjak
On Sat, Jul 26, 2025 at 7:37 PM H.J. Lu wrote: > > __tls_get_addr doesn't preserve vector registers. When a function > with no_caller_saved_registers attribute calls __tls_get_addr, YMM > and ZMM registers will be clobbered. Issue an error and suggest > -mtls-dialect=gnu2 in this case. > > gcc/

Re: [PATCH v2] x86: Disallow -mtls-dialect=gnu with no_caller_saved_registers

2025-07-28 Thread Uros Bizjak
On Sat, Jul 26, 2025 at 7:37 PM H.J. Lu wrote: > > __tls_get_addr doesn't preserve vector registers. When a function > with no_caller_saved_registers attribute calls __tls_get_addr, YMM > and ZMM registers will be clobbered. Issue an error and suggest > -mtls-dialect=gnu2 in this case. > > gcc/

Re: [PATCH 4/4] i386: Fix typo in diagnostic about simultaneous regparm and thiscall use

2025-07-25 Thread Uros Bizjak
On Thu, Jul 24, 2025 at 5:35 PM Artemiy Granat wrote: > > gcc/ChangeLog: > > * config/i386/i386-options.cc (ix86_handle_cconv_attribute): > Fix typo. OK, and also obvious. Thanks, Uros. > --- > gcc/config/i386/i386-options.cc | 2 +- > 1 file changed, 1 insertion(+), 1 deletion

Re: [PATCH 1/4] i386: Ignore regparm attribute and warn for it in 64-bit mode

2025-07-25 Thread Uros Bizjak
On Thu, Jul 24, 2025 at 5:35 PM Artemiy Granat wrote: > > The regparm attribute does not affect code generation on x86-64 target. > Despite this, regparm was accepted silently, unlike other calling > convention attributes handled in the ix86_handle_cconv_attribute > function. > > Due to lack of di

Re: [PATCH 3/4] i386: Fix incorrect handling of simultaneous regparm and thiscall use

2025-07-25 Thread Uros Bizjak
On Thu, Jul 24, 2025 at 5:35 PM Artemiy Granat wrote: > > gcc/ChangeLog: > > * config/i386/i386-options.cc (ix86_handle_cconv_attribute): > Handle simultaneous use of regparm and thiscall attributes in > case when regparm is set before thiscall. > > gcc/testsuite/ChangeLog:

Re: [PATCH 2/4] i386: Fix incorrect comment about stdcall and fastcall compatibility

2025-07-25 Thread Uros Bizjak
On Thu, Jul 24, 2025 at 5:35 PM Artemiy Granat wrote: > > gcc/ChangeLog: > > * config/i386/i386-options.cc (ix86_handle_cconv_attribute): > Fix comments which state that combination of stdcall and fastcall > attributes is valid but redundant. OK as an obvious patch. Thank

Re: [PATCH] x86: Disallow -mtls-dialect=gnu with no_caller_saved_registers

2025-07-25 Thread Uros Bizjak
On Thu, Jul 24, 2025 at 9:30 PM H.J. Lu wrote: > > On x86-64, __tls_get_addr is a normal function which doesn't preserve > vector registers. On i386, ___tls_get_addr preserve vector registers > only with the commit: Can you please rephrase the above part? What does it mean to be a normal functio

[pushed] i386: Use various predicates instead of open coding them

2025-07-16 Thread Uros Bizjak
No functional changes. [patch 1/4] gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_expand_vector_logical_operator): Use CONST_VECTOR_P instead of open coding it. (ix86_expand_int_sse_cmp): Ditto. (ix86_extract_perm_from_pool_constant): Ditto. (ix86_split_to_parts): Ditto.

Re: [PATCH v2] x86: Warn -pg without -mfentry only on glibc targets

2025-07-16 Thread Uros Bizjak
On Wed, Jul 16, 2025 at 5:15 PM H.J. Lu wrote: > > Since only glibc targets support -mfentry, warn -pg without -mfentry only > on glibc targets. > > gcc/ > > PR target/120881 > PR testsuite/121078 > * config/i386/i386-options.cc (ix86_option_override_internal): > Wa

Re: [PATCH v2] gcc-16/changes.html: Add --enable-x86-64-mfentry

2025-07-14 Thread Uros Bizjak
On Mon, Jul 14, 2025 at 9:39 PM H.J. Lu wrote: > > > OK to install? > > > > This should at least say that the new option is enabled by default > > with glibc targets. > > > > Uros. > > Like this? LGTM for content, but let's ask Gerald to proofread the entry. Thanks, Uros.

Re: [PATCH] x86: Convert MMX integer loads from constant vector pool

2025-07-14 Thread Uros Bizjak
On Tue, Jul 15, 2025 at 3:43 AM H.J. Lu wrote: > > For MMX 16-bit, 32-bit and 64-bit constant vector loads from constant > vector pool: > > (insn 6 2 7 2 (set (reg:V1SI 5 di) > (mem/u/c:V1SI (symbol_ref/u:DI ("*.LC0") [flags 0x2]) [0 S4 A32])) > "pr121062-2.c":10:3 2036 {*movv1si_inte

Re: [PATCH] gcc-16/changes.html: Add --enable-x86-64-mfentry

2025-07-14 Thread Uros Bizjak
On Mon, Jul 14, 2025 at 2:34 PM H.J. Lu wrote: > > OK to install? This should at least say that the new option is enabled by default with glibc targets. Uros.

Re: [PATCH v2] x86-64: Add --enable-x86-64-mfentry

2025-07-14 Thread Uros Bizjak
On Mon, Jul 14, 2025 at 1:16 PM Uros Bizjak wrote: > > On Sun, Jul 13, 2025 at 12:16 AM Sam James wrote: > > > > "H.J. Lu" writes: > > > > > On Sat, Jul 12, 2025 at 6:58 AM Siddhesh Poyarekar > > > wrote: > > >> > > >>

Re: [PATCH v2] x86-64: Add --enable-x86-64-mfentry

2025-07-14 Thread Uros Bizjak
On Sun, Jul 13, 2025 at 12:16 AM Sam James wrote: > > "H.J. Lu" writes: > > > On Sat, Jul 12, 2025 at 6:58 AM Siddhesh Poyarekar > > wrote: > >> > >> On 2025-07-11 15:28, Uros Bizjak wrote: > >> >> Why not just switch over uncondi

Re: [PATCH v5] x86: Check all 0s/1s vectors with standard_sse_constant_

2025-07-14 Thread Uros Bizjak
On Mon, Jul 14, 2025 at 11:25 AM H.J. Lu wrote: > > On Mon, Jul 14, 2025 at 4:06 PM Uros Bizjak wrote: > > > > On Mon, Jul 14, 2025 at 9:37 AM H.J. Lu wrote: > > > > > > On Mon, Jul 14, 2025 at 3:11 PM Uros Bizjak wrote: > > > > > >

Re: [PATCH v3] x86: Update MMX moves to support all 1s vectors

2025-07-14 Thread Uros Bizjak
On Mon, Jul 14, 2025 at 9:37 AM H.J. Lu wrote: > > On Mon, Jul 14, 2025 at 3:11 PM Uros Bizjak wrote: > > > > On Mon, Jul 14, 2025 at 5:32 AM Uros Bizjak wrote: > > > > > > On Mon, Jul 14, 2025 at 2:14 AM H.J. Lu wrote: > > > > > >

Re: [PATCH v3] x86: Update MMX moves to support all 1s vectors

2025-07-14 Thread Uros Bizjak
On Mon, Jul 14, 2025 at 9:41 AM H.J. Lu wrote: > > On Mon, Jul 14, 2025 at 3:34 PM Uros Bizjak wrote: > > > > On Mon, Jul 14, 2025 at 9:11 AM Uros Bizjak wrote: > > > > > > On Mon, Jul 14, 2025 at 5:32 AM Uros Bizjak wrote: > > > > >

Re: [PATCH v3] x86: Update MMX moves to support all 1s vectors

2025-07-14 Thread Uros Bizjak
On Mon, Jul 14, 2025 at 9:11 AM Uros Bizjak wrote: > > On Mon, Jul 14, 2025 at 5:32 AM Uros Bizjak wrote: > > > > On Mon, Jul 14, 2025 at 2:14 AM H.J. Lu wrote: > > > > > > On Sat, Jul 12, 2025 at 7:51 PM Uros Bizjak wrote: > > > > >

Re: [PATCH v3] x86: Update MMX moves to support all 1s vectors

2025-07-14 Thread Uros Bizjak
On Mon, Jul 14, 2025 at 5:32 AM Uros Bizjak wrote: > > On Mon, Jul 14, 2025 at 2:14 AM H.J. Lu wrote: > > > > On Sat, Jul 12, 2025 at 7:51 PM Uros Bizjak wrote: > > > > > > On Sat, Jul 12, 2025 at 1:41 PM H.J. Lu wrote: > > > > > >

Re: [PATCH v3] x86: Update MMX moves to support all 1s vectors

2025-07-13 Thread Uros Bizjak
On Mon, Jul 14, 2025 at 2:14 AM H.J. Lu wrote: > > On Sat, Jul 12, 2025 at 7:51 PM Uros Bizjak wrote: > > > > On Sat, Jul 12, 2025 at 1:41 PM H.J. Lu wrote: > > > > > > On Sat, Jul 12, 2025 at 5:58 PM Uros Bizjak wrote: > > > > >

[pushed] i386: Robustify MMX move patterns

2025-07-12 Thread Uros Bizjak
MMX allows only direct moves from zero, so correct V_32:mode and v2qi move patterns to allow only nonimm_or_0_operand as their input operand. gcc/ChangeLog: * config/i386/mmx.md (mov): Use nonimm_or_0_operand predicate for operand 1. (*mov_internal): Ditto. (movv2qi): Ditto. (

Re: [PATCH v2] x86: Update MMXMODE:*mov_internal to support all 1s vectors

2025-07-12 Thread Uros Bizjak
On Sat, Jul 12, 2025 at 1:41 PM H.J. Lu wrote: > > On Sat, Jul 12, 2025 at 5:58 PM Uros Bizjak wrote: > > > > On Sat, Jul 12, 2025 at 11:52 AM H.J. Lu wrote: > > > > > > On Sat, Jul 12, 2025 at 5:03 PM Uros Bizjak wrote: > > > > >

Re: [PATCH v2] x86: Update MMXMODE:*mov_internal to support all 1s vectors

2025-07-12 Thread Uros Bizjak
On Sat, Jul 12, 2025 at 11:52 AM H.J. Lu wrote: > > On Sat, Jul 12, 2025 at 5:03 PM Uros Bizjak wrote: > > > > On Fri, Jul 11, 2025 at 6:05 AM H.J. Lu wrote: > > > > > > commit 77473a27bae04da99d6979d43e7bd0a8106f4557 > > > Author: H.J. Lu

Re: [PATCH v2] x86: Update MMXMODE:*mov_internal to support all 1s vectors

2025-07-12 Thread Uros Bizjak
On Fri, Jul 11, 2025 at 6:05 AM H.J. Lu wrote: > > commit 77473a27bae04da99d6979d43e7bd0a8106f4557 > Author: H.J. Lu > Date: Thu Jun 26 06:08:51 2025 +0800 > > x86: Also handle all 1s float vector constant > > replaces > > (insn 29 28 30 5 (set (reg:V2SF 107) > (mem/u/c:V2SF (symbol

Re: [PATCH] x86-64: Add --enable-x86-64-mfentry

2025-07-11 Thread Uros Bizjak
On Fri, Jul 11, 2025 at 2:33 PM Siddhesh Poyarekar wrote: > > On 2025-07-08 18:07, Sam James wrote: > >> OK in principle, but please allow some time for distro maintainers > >> (CC'd) to voice their opinion. > > > > It looks good to me and I plan on us using it. I'd like opinions from > > one othe

Re: [PATCH v2] x86: Update MMXMODE:*mov_internal to support all 1s vectors

2025-07-11 Thread Uros Bizjak
On Fri, Jul 11, 2025 at 10:39 AM H.J. Lu wrote: > > On Fri, Jul 11, 2025 at 4:23 PM Uros Bizjak wrote: > > > > On Fri, Jul 11, 2025 at 9:57 AM Uros Bizjak wrote: > > > > > > On Fri, Jul 11, 2025 at 6:05 AM H.J. Lu wrote: > > > > > > > gc

Re: [PATCH v2] x86: Update MMXMODE:*mov_internal to support all 1s vectors

2025-07-11 Thread Uros Bizjak
On Fri, Jul 11, 2025 at 9:57 AM Uros Bizjak wrote: > > On Fri, Jul 11, 2025 at 6:05 AM H.J. Lu wrote: > > > gcc/ > > > > PR target/121015 > > * config/i386/constraints.md (BX): New constraint. > > * config/i386/i386.cc (ix86_print_operand): Support CONSTM1

Re: [PATCH v2] x86: Update MMXMODE:*mov_internal to support all 1s vectors

2025-07-11 Thread Uros Bizjak
On Fri, Jul 11, 2025 at 6:05 AM H.J. Lu wrote: > gcc/ > > PR target/121015 > * config/i386/constraints.md (BX): New constraint. > * config/i386/i386.cc (ix86_print_operand): Support CONSTM1_RTX. > * config/i386/mmx.md (MMXMODE:*mov_internal): Replace C with > BX for memory and integer register de

Re: [PATCH] x86: Update "*mov_internal" in mmx.md to handle all 1s vectors

2025-07-10 Thread Uros Bizjak
On Thu, Jul 10, 2025 at 2:31 PM Uros Bizjak wrote: > > On Thu, Jul 10, 2025 at 1:57 PM H.J. Lu wrote: > > > > commit 77473a27bae04da99d6979d43e7bd0a8106f4557 > > Author: H.J. Lu > > Date: Thu Jun 26 06:08:51 2025 +0800 > > > > x86: Also handle all

Re: [PATCH] x86: Update "*mov_internal" in mmx.md to handle all 1s vectors

2025-07-10 Thread Uros Bizjak
On Thu, Jul 10, 2025 at 1:57 PM H.J. Lu wrote: > > commit 77473a27bae04da99d6979d43e7bd0a8106f4557 > Author: H.J. Lu > Date: Thu Jun 26 06:08:51 2025 +0800 > > x86: Also handle all 1s float vector constant > > replaces > > (insn 29 28 30 5 (set (reg:V2SF 107) > (mem/u/c:V2SF (symbol

Re: [PATCH] x86-64: Add --enable-x86-64-mfentry

2025-07-03 Thread Uros Bizjak
On Thu, Jul 3, 2025 at 12:01 PM H.J. Lu wrote: > > When profiling is enabled with shrink wrapping, the mcount call may not > be placed at the function entry after > > pushq %rbp > movq %rsp,%rbp > > As the result, the profile data may be skewed which makes PGO less > effective. > > Add --enable-x8

Re: [PATCH] x86: Emit label only for __mcount_loc section

2025-07-03 Thread Uros Bizjak
On Thu, Jul 3, 2025 at 11:54 AM H.J. Lu wrote: > > commit ecc81e33123d7ac9c11742161e128858d844b99d (HEAD) > Author: Andi Kleen > Date: Fri Sep 26 04:06:40 2014 + > > Add direct support for Linux kernel __fentry__ patching > > emitted a label, 1, for __mcount_loc section: > > 1: call mco

Re: [PATCH] x86-64: Add RDI clobber to tls_local_dynamic_64 patterns

2025-07-02 Thread Uros Bizjak
On Thu, Jul 3, 2025 at 6:32 AM H.J. Lu wrote: > > *tls_local_dynamic_64_ uses RDI as the __tls_get_addr argument. > Add RDI clobber to tls_local_dynamic_64 patterns to show it. > > PR target/120908 > * config/i386/i386.cc (legitimize_tls_address): Pass RDI to > gen_tls_local_dynamic_64. > * config

Re: [PATCH] x86-64: Add RDI clobber to tls_global_dynamic_64 patterns

2025-07-02 Thread Uros Bizjak
On Wed, Jul 2, 2025 at 2:43 PM H.J. Lu wrote: > > *tls_global_dynamic_64_ uses RDI as the __tls_get_addr argument. > Add RDI clobber to tls_global_dynamic_64 patterns to show it. > > PR target/120908 > * config/i386/i386.cc (legitimize_tls_address): Pass RDI to > gen_tls_global_dynamic_64. > * con

Re: [committed] i386: Introduce crc_revsi4 expanders [PR120719]

2025-06-26 Thread Uros Bizjak
On Fri, Jun 27, 2025 at 7:27 AM Andi Kleen wrote: > > Uros Bizjak writes: > > > Introduce crc_revsi4 expanders to generate CRC32 instruction when > > using > > __builtin_rev_crc32_data* builtins with 0x1EDC6F41 polynomial and -mcrc32. > > > >

[committed] i386: Introduce crc_revsi4 expanders [PR120719]

2025-06-26 Thread Uros Bizjak
Introduce crc_revsi4 expanders to generate CRC32 instruction when using __builtin_rev_crc32_data* builtins with 0x1EDC6F41 polynomial and -mcrc32. PR target/120719 gcc/ChangeLog: * config/i386/i386.md (crc_revsi4): New expander. gcc/testsuite/ChangeLog: * gcc.target/i386/crc-builti

Re: [PATCH] Fix shrink wrap separate ICE for mingw [PR120741]

2025-06-25 Thread Uros Bizjak
On Tue, Jun 24, 2025 at 4:54 AM Cui, Lili wrote: > > > > From: Lili Cui > > > > > > > > Hi Uros, > > > > > > > > I need to remove another assertion in the shrink wrap separate patch. > > > Added two cases for changing the CHECK_STACK_LIMIT value. > > > > > > > > The default values for CHECK_STAC

[committed] i386: Convert LEA stack adjust insn to SUB when FLAGS_REG is dead

2025-06-24 Thread Uros Bizjak
ADD/SUB is faster than LEA for most processors. Also, there are several peephole2 patterns available that convert prologue esp subtractions to pushes (at the end of i386.md). These process only patterns with flags reg clobber, so they are ineffective with clobber-less stack ptr adjustments, introdu

Re: [PATCH v3] x86: Update memcpy/memset inline strategies for -mtune=generic

2025-06-24 Thread Uros Bizjak
On Tue, Jun 24, 2025 at 5:22 AM Hongtao Liu wrote: > > > > Ideally we should catch repeated constants more generally since > > > > this appears elsewhere too. > > > > I am not quite sure where to fit it best. We already have a > > > > machine specific task that loads 0 into

Re: [PATCH] Fix shrink wrap separate ICE for mingw [PR120741]

2025-06-23 Thread Uros Bizjak
On Mon, Jun 23, 2025 at 1:19 PM Cui, Lili wrote: > > From: Lili Cui > > Hi Uros, > > I need to remove another assertion in the shrink wrap separate patch. Added > two cases for changing the CHECK_STACK_LIMIT value. > > The default values for CHECK_STACK_LIMIT for target mingw and option > -msta

Re: [PATCH] x86: Get the widest vector mode from MOVE_MAX

2025-06-20 Thread Uros Bizjak
On Thu, Jun 19, 2025 at 1:27 PM H.J. Lu wrote: > > Since MOVE_MAX defines the maximum number of bytes that an instruction > can move quickly between memory and registers, use it to get the widest > vector mode in vector loop when inlining memcpy and memset. > > gcc/ > > PR target/120708 > * config

Re: [PATCH] x86: Fix shrink wrap separate ICE under -fstack-clash-protection [PR120697]

2025-06-19 Thread Uros Bizjak
On Thu, Jun 19, 2025 at 9:37 AM Uros Bizjak wrote: > > On Wed, Jun 18, 2025 at 4:12 PM Cui, Lili wrote: > > > > > > > > > -Original Message- > > > From: Uros Bizjak > > > Sent: Wednesday, June 18, 2025 9:22 PM > > > To

Re: [PATCH v4] x86: Enable *mov_(and|or) only for -Oz

2025-06-19 Thread Uros Bizjak
On Thu, Jun 19, 2025 at 9:01 AM Hongtao Liu wrote: > > On Wed, Jun 18, 2025 at 6:38 PM H.J. Lu wrote: > > > > commit ef26c151c14a87177d46fd3d725e7f82e040e89f > > Author: Roger Sayle > > Date: Thu Dec 23 12:33:07 2021 + > > > > x86: PR target/103773: Fix wrong-code with -Oz from pop to

Re: [PATCH] x86: Fix shrink wrap separate ICE under -fstack-clash-protection [PR120697]

2025-06-19 Thread Uros Bizjak
On Wed, Jun 18, 2025 at 4:12 PM Cui, Lili wrote: > > > > > -Original Message- > > From: Uros Bizjak > > Sent: Wednesday, June 18, 2025 9:22 PM > > To: Cui, Lili > > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao ; > > hongjiu...@intel.com

Re: [PATCH] x86: Fix shrink wrap separate ICE under -fstack-clash-protection [PR120697]

2025-06-18 Thread Uros Bizjak
On Wed, Jun 18, 2025 at 3:11 PM Cui, Lili wrote: > > From: Lili Cui > > Hi Uros, > > An assertion I added in shrink wrap separate V2 reports ICE when > -fstack-clash-protection is enabled. The assertion should not be added here. > > I created a patch to remove 3 assertions and their associated c

Re: [PATCH V3] x86: Enable separate shrink wrapping

2025-06-17 Thread Uros Bizjak
On Tue, Jun 17, 2025 at 4:03 PM Cui, Lili wrote: > > From: Lili Cui > > Hi Uros, > > This is patch v3, the main changes are as follows. > > 1. Added a pro_epilogue_adjust_stack_add_nocc in i386.md to add memory > clobber for lea/mov. > 2. Adjusted some formatting issues. > 3. Added scan-rtl-dump

Re: [PATCH v2] x86: Update memcpy/memset inline strategies for -mtune=generic

2025-06-15 Thread Uros Bizjak
On Fri, Jun 13, 2025 at 3:15 PM Cui, Lili wrote: > > > On Mon, Apr 21, 2025 at 7:24 AM H.J. Lu wrote: > > > > > > > > On Sun, Apr 20, 2025 at 6:31 PM Jan Hubicka wrote: > > > > > > > > > > > PR target/102294 > > > > > > PR target/119596 > > > > > > * config/i386/x86-tune-costs

[committed] i386: Fix signed integer overflow in ix86_expand_int_movcc, part 2 [PR120604]

2025-06-12 Thread Uros Bizjak
Make sure we can represent the difference between two 64-bit DImode immediate values in 64-bit HOST_WIDE_INT and return false if this is not the case. ix86_expand_int_movcc is used in the movcc expander. The expander will FAIL when the function returns false, and the middle-end will retry expansion with value
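
As a rough illustration of the representability check described in this message (not the committed GCC change), the sketch below tests whether the difference of two signed 64-bit values fits before computing it. Here int64_t stands in for HOST_WIDE_INT, and the helper name diff_fits_int64 is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store a - b in *diff and return true when the
   result is representable in int64_t; return false, leaving *diff
   untouched, when the subtraction would overflow.  */
bool
diff_fits_int64 (int64_t a, int64_t b, int64_t *diff)
{
  assert (diff != NULL);

  /* a - b overflows upward when b < 0 and a > INT64_MAX + b,
     and downward when b > 0 and a < INT64_MIN + b.  Neither
     bound computation can itself overflow.  */
  if ((b < 0 && a > INT64_MAX + b) || (b > 0 && a < INT64_MIN + b))
    return false;

  *diff = a - b;
  return true;
}
```

A caller modeled on the description above would simply give up (return false) when the helper reports that the difference is not representable.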

Re: [PATCH V2] x86: Enable separate shrink wrapping

2025-06-12 Thread Uros Bizjak
On Thu, Jun 12, 2025 at 10:58 AM Uros Bizjak wrote: > > On Thu, Jun 12, 2025 at 9:26 AM Cui, Lili wrote: > > > > > > @@ -7753,8 +7762,12 @@ pro_epilogue_adjust_stack (rtx dest, rtx src, > > > > rtx > > > offset, > > > > add_frame_

Re: [PATCH V2] x86: Enable separate shrink wrapping

2025-06-12 Thread Uros Bizjak
On Thu, Jun 12, 2025 at 9:26 AM Cui, Lili wrote: > > > > @@ -7753,8 +7762,12 @@ pro_epilogue_adjust_stack (rtx dest, rtx src, > > > rtx > > offset, > > > add_frame_related_expr = true; > > > } > > > > > > + if (crtl->shrink_wrapped_separate) insn = emit_insn (gen_rtx_SET > > > + (d

[committed] i386: Fix signed integer overflow in ix86_expand_int_movcc [PR120604]

2025-06-11 Thread Uros Bizjak
Patch for PR120553 enabled full 64-bit DImode immediates in ix86_expand_int_movcc. However, the function calculates the difference between two immediate arguments using signed 64-bit HOST_WIDE_INT subtractions that can cause signed integer overflow. Avoid the overflow by casting operands of subtr
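
One common way to avoid signed overflow in a subtraction is to perform it in the corresponding unsigned type, where wrap-around is well defined. The sketch below shows that idiom with int64_t standing in for HOST_WIDE_INT; the helper name wrapping_diff is hypothetical, and whether it matches the committed change exactly is an assumption, since the message above is truncated.

```c
#include <stdint.h>

/* Hypothetical helper: compute a - b modulo 2^64.  Unsigned
   arithmetic wraps by definition, so no undefined behavior occurs
   even when the mathematical difference does not fit in int64_t.  */
uint64_t
wrapping_diff (int64_t a, int64_t b)
{
  return (uint64_t) a - (uint64_t) b;
}
```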

Re: [PATCH V2] x86: Enable separate shrink wrapping

2025-06-11 Thread Uros Bizjak
On Wed, Jun 11, 2025 at 5:33 AM Cui, Lili wrote: > > From: Lili Cui > > Hi Uros, > > Thank you very much for providing detailed BKM to reproduce Linux kernel boot > failure. My patch and Matz's patch have this problem. We inserted a SUB > between TEST and JLE, and the SUB changes the value of

Re: [PATCH] i386: Handle ZERO_EXTEND like SIGN_EXTEND in bsr patterns [PR120434]

2025-06-09 Thread Uros Bizjak
On Fri, Jun 6, 2025 at 3:43 PM Jakub Jelinek wrote: > > Hi! > > The just posted second PR120434 patch causes > +FAIL: gcc.target/i386/pr78103-3.c scan-assembler m(leaq|addq|incq)M > +FAIL: gcc.target/i386/pr78103-3.c scan-assembler-not mmovlM+ > +FAIL: gcc.target/i386/pr78103-3.c s

[PATCH] i386: Improve "movcc" expander for DImode immediates [PR120553]

2025-06-05 Thread Uros Bizjak
"movcc" expander uses x86_64_general_operand predicate that limits the range of immediate operands to 32-bit size. The usage of this predicate causes ifcvt to force out-of-range immediates to registers when converting through noce_try_cmove. The testcase: long long foo (long long c) { return c >
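
The 32-bit immediate limit mentioned in this message reflects the fact that most x86-64 instructions encode at most a 32-bit immediate, which the processor sign-extends to 64 bits. A minimal sketch of that range check follows; fits_simm32 is a hypothetical name used for illustration and is not GCC's predicate.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: return true when VALUE can be encoded as a
   32-bit immediate that is sign-extended to 64 bits.  */
bool
fits_simm32 (int64_t value)
{
  return value >= INT32_MIN && value <= INT32_MAX;
}
```

Constants outside this range are the "out-of-range immediates" that, per the message, get forced into registers.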

Re: [PATCH] i386: Fix vmovvdup's mem attribute

2025-06-04 Thread Uros Bizjak
On Thu, Jun 5, 2025 at 3:29 AM Hu, Lin1 wrote: > > Hi, > > Some vmovvdup pattern's type attribute is sselog1 and then mem attribute is > both. Modify type attribute according to other patterns about vmovvdup. > > Bootstrapped and regtested on x86_64-linux-pc-gnu, OK for trunk? OK. Thanks, Uros.

[PATCH] rtl-optimization: Invalid CSE of inline asm with memory clobber [PR111901]

2025-05-29 Thread Uros Bizjak
The following test: --cut here-- int test (void) { unsigned int sum = 0; for (int i = 0; i < 4; i++) { unsigned int val; asm ("magic %0" : "=r" (val) : : "memory"); sum += val; } return sum; } --cut here-- compiles on x86_64 with -O2 -funroll-all-loops to nonsen

Re: [PATCH] i386, v2: Extend *cmp_minus_1 optimizations also to plus with CONST_INT [PR120360]

2025-05-21 Thread Uros Bizjak
On Wed, May 21, 2025 at 1:20 PM Jakub Jelinek wrote: > > On Wed, May 21, 2025 at 11:48:34AM +0200, Uros Bizjak wrote: > > Please introduce "x86_64_neg_const_int_operand" predicate that will > > allow only const_int operands, and will reject negative endbr (and >

Re: [PATCH] i386: Extend *cmp_minus_1 optimizations also to plus with CONST_INT [PR120360]

2025-05-21 Thread Uros Bizjak
On Wed, May 21, 2025 at 9:44 AM Jakub Jelinek wrote: > > Hi! > > As mentioned by Linus, we can't optimize comparison of otherwise unused > result of plus with CONST_INT second operand, compared against zero. > This can be done using just cmp instruction with negated constant and say > js/jns/je/jn

Re: [PATCH] [testsuite] add missing require vect_early_break_hw for vect-tsvc

2025-05-19 Thread Uros Bizjak
LGTM for the whole series. Thanks, Uros. On Tue, May 20, 2025 at 6:17 AM Alexandre Oliva wrote: > > > Some tsvc tests add vect_early_break options without requiring the > feature to be available. Add the requirements. > > Regstrapped on x86_64-linux-gnu. Also tested with gcc-14 on aarch64-, >

Re: [PATCH] x86: Enable separate shrink wrapping

2025-05-13 Thread Uros Bizjak
On Tue, May 13, 2025 at 8:15 AM Cui, Lili wrote: > > From: Lili Cui > > Hi, > > This patch is to enable separate shrink wrapping for x86. > > Bootstrapped & regtested on x86-64-pc-linux-gnu. > > Ok for trunk? Unfortunately, the patched compiler fails to boot the latest linux kernel. Uros. Uro

Re: [PATCH] x86: Enable separate shrink wrapping

2025-05-13 Thread Uros Bizjak
On Tue, May 13, 2025 at 8:15 AM Cui, Lili wrote: > > From: Lili Cui > > Hi, > > This patch is to enable separate shrink wrapping for x86. > > Bootstrapped & regtested on x86-64-pc-linux-gnu. > > Ok for trunk? > > > This commit implements the target macros (TARGET_SHRINK_WRAP_*) that > enable separ

Re: [PATCH] x86: Remove df_insn_rescan after emit_insn_*

2025-05-11 Thread Uros Bizjak
On Mon, May 12, 2025 at 8:19 AM H.J. Lu wrote: > > Since df_insn_rescan has been called by emit_insn_*, there is no need > to call it after calling emit_insn_*. Remove its unnecessary usages. > > PR target/120228 > * config/i386/i386-features.cc (ix86_place_single_vector_set): > Remove df_insn_re

[pushed]: i386: Do not use explicit operands for MOVS instructions [PR120019]

2025-05-05 Thread Uros Bizjak
Some assemblers do not support MOVS instructions with explicit operands. Emit instruction with implicit operands, but prefix the instruction with a segment override prefix if the memory operand refers to ADDR_SPACE_SEG_FS or ADDR_SPACE_SEG_GS named address space. PR target/120019 gcc/ChangeLo

Re: [PATCH v4] libstdc++: Implement C++26 features (P2546R5)

2025-05-05 Thread Uros Bizjak
On Thu, May 1, 2025 at 12:59 PM Jonathan Wakely wrote: > > This includes the P2810R4 (is_debugger_present is_replaceable) changes, > allowing std::is_debugger_present to be replaced by the program. > > It would be good to provide a macOS definition of is_debugger_present as > per https://developer

Re: [PATCH] x86-64: Don't expand UNSPEC_TLS_LD_BASE to a call

2025-05-02 Thread Uros Bizjak
On Fri, May 2, 2025 at 2:33 AM H.J. Lu wrote: > > On Wed, Apr 30, 2025 at 7:40 PM Uros Bizjak wrote: > > > > On Tue, Apr 29, 2025 at 12:22 PM H.J. Lu wrote: > > > > > > On Tue, Apr 29, 2025 at 5:30 PM Uros Bizjak wrote: > > > > >

Re: [PATCH RFA] i386: -Wabi false positive with indirect call

2025-05-02 Thread Uros Bizjak
On Thu, May 1, 2025 at 10:46 PM Jason Merrill wrote: > > Tested x86_64-pc-linux-gnu, OK for trunk? > > -- 8< -- > > This warning relies on the TRANSLATION_UNIT_WARN_EMPTY_P flag (set in > cxx_init_decl_processing) to decide whether we want to warn about the GCC 8 > empty class parameter passing fi

Re: [PATCH] x86: Remove BREG from ix86_class_likely_spilled_p

2025-05-01 Thread Uros Bizjak
On Thu, May 1, 2025 at 1:21 PM Richard Sandiford wrote: > > Uros Bizjak writes: > > On Wed, Apr 30, 2025 at 11:31 PM H.J. Lu wrote: > >> > >> On Wed, Apr 30, 2025 at 7:37 PM Uros Bizjak wrote: > >> > > >> > On Tue, Apr 29, 2025 at 11:40 PM

Re: [PATCH] x86: Remove BREG from ix86_class_likely_spilled_p

2025-05-01 Thread Uros Bizjak
On Thu, May 1, 2025 at 9:10 AM H.J. Lu wrote: > > On Thu, May 1, 2025 at 2:56 PM Uros Bizjak wrote: > > > > On Wed, Apr 30, 2025 at 11:31 PM H.J. Lu wrote: > > > > > > On Wed, Apr 30, 2025 at 7:37 PM Uros Bizjak wrote: > > > > > >

Re: [PATCH] x86: Update TARGET_SMALL_REGISTER_CLASSES_FOR_MODE_P

2025-05-01 Thread Uros Bizjak
On Thu, May 1, 2025 at 12:49 AM H.J. Lu wrote: > > On Wed, Apr 30, 2025 at 7:48 PM Uros Bizjak wrote: > > > > On Tue, Apr 29, 2025 at 11:40 PM H.J. Lu wrote: > > > > > > SMALL_REGISTER_CLASSES was added by > > > > > > commit c98f874233

Re: [PATCH] x86: Remove SSE_FIRST_REG from ix86_class_likely_spilled_p

2025-05-01 Thread Uros Bizjak
On Wed, Apr 30, 2025 at 11:43 PM H.J. Lu wrote: > > On Wed, Apr 30, 2025 at 8:12 PM Uros Bizjak wrote: > > > > On Tue, Apr 29, 2025 at 11:40 PM H.J. Lu wrote: > > > > > > SSE_FIRST_REG was added to CLASS_LIKELY_SPILLED_P, which became > > > TARGET_

Re: [PATCH] x86: Remove BREG from ix86_class_likely_spilled_p

2025-04-30 Thread Uros Bizjak
On Wed, Apr 30, 2025 at 11:31 PM H.J. Lu wrote: > > On Wed, Apr 30, 2025 at 7:37 PM Uros Bizjak wrote: > > > > On Tue, Apr 29, 2025 at 11:40 PM H.J. Lu wrote: > > > > > > AREG, DREG, CREG and AD_REGS are kept in ix86_class_likely_spilled_p to > >

Re: [PATCH] x86: Remove SSE_FIRST_REG from ix86_class_likely_spilled_p

2025-04-30 Thread Uros Bizjak
On Tue, Apr 29, 2025 at 11:40 PM H.J. Lu wrote: > > SSE_FIRST_REG was added to CLASS_LIKELY_SPILLED_P, which became > TARGET_CLASS_LIKELY_SPILLED_P, for > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40470 > > Since RA has been improved and xmm0 is a commonly used register, remove > SSE_FIRST_RE

Re: [PATCH] x86: Update TARGET_SMALL_REGISTER_CLASSES_FOR_MODE_P

2025-04-30 Thread Uros Bizjak
On Tue, Apr 29, 2025 at 11:40 PM H.J. Lu wrote: > > SMALL_REGISTER_CLASSES was added by > > commit c98f874233428d7e6ba83def7842fd703ac0ddf1 > Author: James Van Artsdalen > Date: Sun Feb 9 13:28:48 1992 + > > Initial revision > > which became TARGET_SMALL_REGISTER_CLASSES_FOR_MODE_P. It

Re: [PATCH] x86-64: Don't expand UNSPEC_TLS_LD_BASE to a call

2025-04-30 Thread Uros Bizjak
On Tue, Apr 29, 2025 at 12:22 PM H.J. Lu wrote: > > On Tue, Apr 29, 2025 at 5:30 PM Uros Bizjak wrote: > > > > On Tue, Apr 29, 2025 at 9:56 AM H.J. Lu wrote: > > > > > > Don't expand UNSPEC_TLS_LD_BASE to a call so that the RTL local copy >

Re: [PATCH] x86: Remove BREG from ix86_class_likely_spilled_p

2025-04-30 Thread Uros Bizjak
On Tue, Apr 29, 2025 at 11:40 PM H.J. Lu wrote: > > AREG, DREG, CREG and AD_REGS are kept in ix86_class_likely_spilled_p to > avoid the following regressions with > > $ make check RUNTESTFLAGS="--target_board='unix{-m32,}'" > > FAIL: gcc.dg/pr105911.c (internal compiler error: in lra_split_hard_re

[pushed] i386: Disable string insn from non-default AS for Pmode != word_mode [PR111657]

2025-04-29 Thread Uros Bizjak
0x67 prefix is applied before segment register. That is in rep movsq %gs:(%esi), (%edi) the address is %gs + %esi. In case Pmode != word_mode (x32 with a default -maddress-mode=short) instructions should not allow segment override prefixes. Also, remove explicit addr32 prefix from asm templa

Re: [pushed] i386: Allow string instructions from non-default address space [PR111657]

2025-04-29 Thread Uros Bizjak
On Tue, Apr 29, 2025 at 12:41 PM H.J. Lu wrote: > > On Tue, Apr 29, 2025 at 5:52 PM Uros Bizjak wrote: > > > > MOVS instructions allow segment override of their source operand, e.g.: > > > > rep movsq %gs:(%rsi), (%rdi) > > > > where %rsi is th

[pushed] i386: Allow string instructions from non-default address space [PR111657]

2025-04-29 Thread Uros Bizjak
MOVS instructions allow segment override of their source operand, e.g.: rep movsq %gs:(%rsi), (%rdi) where %rsi is the address of the source location (with %gs segment override) and %rdi is the address of the destination location. The testcase improves from (-O2 -mno-sse -mtune=generic):

Re: [PATCH] x86-64: Don't expand UNSPEC_TLS_LD_BASE to a call

2025-04-29 Thread Uros Bizjak
On Tue, Apr 29, 2025 at 9:56 AM H.J. Lu wrote: > > Don't expand UNSPEC_TLS_LD_BASE to a call so that the RTL local copy > propagation pass can eliminate multiple __tls_get_addr calls. __tls_get_addr needs to be called with 16-byte aligned stack, I don't think the compiler will correctly handle re
