Re: [PATCH] i386; Add indirect_return function attribute

2018-07-03 Thread H.J. Lu
On Fri, Jun 8, 2018 at 3:27 AM, H.J. Lu wrote: > On x86, swapcontext may return via indirect branch when shadow stack > is enabled. To support code instrumentation of control-flow transfers > with -fcf-protection, add indirect_return function attribute to inform > compiler that a

Re: [PATCH] i386; Add indirect_return function attribute

2018-07-03 Thread H.J. Lu
On Tue, Jul 3, 2018 at 9:12 AM, Uros Bizjak wrote: > On Tue, Jul 3, 2018 at 5:32 PM, H.J. Lu wrote: >> On Fri, Jun 8, 2018 at 3:27 AM, H.J. Lu wrote: >>> On x86, swapcontext may return via indirect branch when shadow stack >>> is enabled. To support code inst

[PATCH] x86: Tune Skylake, Cannonlake and Icelake as Haswell

2018-07-12 Thread H.J. Lu
mance unchanged on Haswell. OK for trunk? Thanks. H.J. --- gcc/ 2018-07-12 H.J. Lu Sunil K Pandey PR target/84413 * config/i386/i386.c (m_HASWELL): Add PROCESSOR_SKYLAKE, PROCESSOR_SKYLAKE_AVX512, PROCESSOR_CANNONLAKE, PROCESSOR_ICELAKE_CLIEN

Re: [PATCH] x86: Tune Skylake, Cannonlake and Icelake as Haswell

2018-07-13 Thread H.J. Lu
On Fri, Jul 13, 2018 at 08:53:02AM +0200, Uros Bizjak wrote: > On Thu, Jul 12, 2018 at 9:57 PM, H.J. Lu wrote: > > > r259399, which added PROCESSOR_SKYLAKE, disabled many x86 optimizations > > which are enabled by PROCESSOR_HASWELL. As the result, -mtune=skylake > > g

Re: [PATCH] x86: Tune Skylake, Cannonlake and Icelake as Haswell

2018-07-13 Thread H.J. Lu
e have also noticed that benchmarks on skylake are not good compared to > haswell, this nicely explains it. I think this is -march=native regression > compared to GCC versions that did not support better CPUs than Haswell. So > it > would be nice to backport it. Yes, we shoul

Re: [PATCH] x86: Tune Skylake, Cannonlake and Icelake as Haswell

2018-07-13 Thread H.J. Lu
ed better CPUs than Haswell. >> > So it >> > would be nice to backport it. >> >> Yes, we should. Here is the patch to backport to GCC 8. OK for GCC 8 after >> it has been checked into trunk? > > OK, > Honza >> >> Thanks. >> >> -- >

Re: [PATCH] x86: Tune Skylake, Cannonlake and Icelake as Haswell

2018-07-14 Thread H.J. Lu
On Sat, Jul 14, 2018 at 06:09:47PM +0200, Gerald Pfeifer wrote: > On Fri, 13 Jul 2018, H.J. Lu wrote: > > I will do the same for GCC8 backport. > > Can you please add a note to gcc-8/changes.html? This seems big > enough to warrant a note in a part for GCC 8.2. > > (A

[PATCH 2/3] i386: Change indirect_return to function type attribute

2018-07-18 Thread H.J. Lu
In struct ucontext; typedef struct ucontext ucontext_t; extern int (*bar) (ucontext_t *__restrict __oucp, const ucontext_t *__restrict __ucp) __attribute__((__indirect_return__)); extern int res; void foo (ucontext_t *oucp, ucontext_t *ucp) { res = bar (oucp, ucp); } bar

[PATCH] Call REAL(swapcontext) with indirect_return attribute on x86

2018-07-18 Thread H.J. Lu
asan/asan_interceptors.cc has ... int res = REAL(swapcontext)(oucp, ucp); ... REAL(swapcontext) is a function pointer to swapcontext in libc. Since swapcontext may return via indirect branch on x86 when shadow stack is enabled, we need to call REAL(swapcontext) with indirect_return attribute o

Re: [PATCH] Call REAL(swapcontext) with indirect_return attribute on x86

2018-07-18 Thread H.J. Lu
in upstream. That is why I opened an LLVM bug. > --kcc > > On Wed, Jul 18, 2018 at 8:37 AM H.J. Lu wrote: >> >> asan/asan_interceptors.cc has >> >> ... >> int res = REAL(swapcontext)(oucp, ucp); >> ... >> >> REAL(swapcontext) is a funct

Re: [PATCH] Call REAL(swapcontext) with indirect_return attribute on x86

2018-07-18 Thread H.J. Lu
On Wed, Jul 18, 2018 at 11:45 AM, Kostya Serebryany wrote: > On Wed, Jul 18, 2018 at 11:40 AM H.J. Lu wrote: >> >> On Wed, Jul 18, 2018 at 11:18 AM, Kostya Serebryany wrote: >> > What's ENDBR and do we really need to have it in compiler-rt? >> >> Wh

[PATCH] i386: Define __HAVE_INDIRECT_RETURN_ATTRIBUTE__

2018-07-19 Thread H.J. Lu
On Thu, Jul 19, 2018 at 10:35:27AM +0200, Richard Biener wrote: > On Wed, Jul 18, 2018 at 5:33 PM H.J. Lu wrote: > > > > In > > > > struct ucontext; > > typedef struct ucontext ucontext_t; > > > > extern int (*bar) (ucontext_t *__restrict __

Re: [PATCH] i386: Define __HAVE_INDIRECT_RETURN_ATTRIBUTE__

2018-07-19 Thread H.J. Lu
On Thu, Jul 19, 2018 at 01:39:04PM +0200, Florian Weimer wrote: > On 07/19/2018 01:33 PM, Jakub Jelinek wrote: > > On Thu, Jul 19, 2018 at 04:21:26AM -0700, H.J. Lu wrote: > > > The new indirect_return attribute is intended to mark swapcontext in > >

Re: [PATCH] i386: Define __HAVE_INDIRECT_RETURN_ATTRIBUTE__

2018-07-19 Thread H.J. Lu
On Thu, Jul 19, 2018 at 4:56 AM, Jakub Jelinek wrote: > On Thu, Jul 19, 2018 at 01:54:46PM +0200, Florian Weimer wrote: >> On 07/19/2018 01:48 PM, H.J. Lu wrote: >> > Both __has_attribute (indirect_return) and __has_attribute >> > (__indirect_return__) >> > wo
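
The entry above mentions testing for the attribute with `__has_attribute`. A minimal sketch of how such a feature test might be wrapped follows; `INDIRECT_RETURN`, `HAVE_INDIRECT_RETURN` and `demo_callee` are hypothetical names chosen for illustration and do not come from GCC, glibc or the thread.

```c
/* Fall back gracefully on compilers that lack __has_attribute.  */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* INDIRECT_RETURN and HAVE_INDIRECT_RETURN are hypothetical names for
   this sketch; they are not defined by GCC or glibc.  */
#if __has_attribute (__indirect_return__)
# define INDIRECT_RETURN __attribute__ ((__indirect_return__))
# define HAVE_INDIRECT_RETURN 1
#else
# define INDIRECT_RETURN
# define HAVE_INDIRECT_RETURN 0
#endif

/* Hypothetical stand-in for a swapcontext-like function.  The attribute
   goes on the declaration; the definition follows separately.  */
static int demo_callee (int x) INDIRECT_RETURN;

static int
demo_callee (int x)
{
  return x;
}
```

On a compiler without the attribute the macro expands to nothing, so the same declaration compiles unchanged.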

Re: [PATCH] Call REAL(swapcontext) with indirect_return attribute on x86

2018-07-19 Thread H.J. Lu
On Wed, Jul 18, 2018 at 12:34:28PM -0700, Kostya Serebryany wrote: > On Wed, Jul 18, 2018 at 12:29 PM H.J. Lu wrote: > > > > On Wed, Jul 18, 2018 at 11:45 AM, Kostya Serebryany wrote: > > > On Wed, Jul 18, 2018 at 11:40 AM H.J. Lu wrote: > > >> > >

[PATCH] i386: Remove _Unwind_Frames_Increment

2018-07-20 Thread H.J. Lu
Tested on CET SDV using the CET kernel on cet branch at: https://github.com/yyu168/linux_cet/tree/cet OK for trunk and GCC 8 branch? Thanks. H.J. --- The CET kernel has been changed to place a restore token on shadow stack for signal handler to enhance security. It is usually transparent to u

[PATCH] libsanitizer: Mark REAL(swapcontext) with indirect_return attribute on x86

2018-07-20 Thread H.J. Lu
Cherry-pick compiler-rt revision 337603: When shadow stack from Intel CET is enabled, the first instruction of all indirect branch targets must be a special instruction, ENDBR. lib/asan/asan_interceptors.cc has ... int res = REAL(swapcontext)(oucp, ucp); ... REAL(swapcontext) is a function po

Re: [PATCH] specify large command line option arguments (PR 82063)

2018-07-21 Thread H.J. Lu
On Fri, Jul 20, 2018 at 1:57 PM, Martin Sebor wrote: > On 07/19/2018 04:31 PM, Jeff Law wrote: >> >> On 06/24/2018 03:05 PM, Martin Sebor wrote: >>> >>> Storing integer command line option arguments in type int >>> limits options such as -Wlarger-than= or -Walloca-larger-than >>> to at most INT_MA

V3 [PATCH] C/C++: Add -Waddress-of-packed-member

2018-07-23 Thread H.J. Lu
ned warn_for_pointer_of_packed_member and warn_for_address_of_packed_member into warn_for_address_or_pointer_of_packed_member. Tested on Linux/x86-64 and Linux/i686. OK for trunk. Thanks. -- H.J. From 2ddae2d57d2875e80c9186b281edfabfddb42e86 Mon Sep 17 00:00:00 2001 From: "H.J. Lu" Date: Fri, 12 Jan 2018 21:12:05 -0800 Sub

PING [PATCH] libsanitizer: Mark REAL(swapcontext) with indirect_return attribute on x86

2018-07-26 Thread H.J. Lu
On Fri, Jul 20, 2018 at 1:11 PM, H.J. Lu wrote: > Cherry-pick compiler-rt revision 337603: > > When shadow stack from Intel CET is enabled, the first instruction of all > indirect branch targets must be a special instruction, ENDBR. > > lib/asan/asan_interceptors.cc has > >

PING [PATCH] i386: Remove _Unwind_Frames_Increment

2018-07-26 Thread H.J. Lu
On Fri, Jul 20, 2018 at 11:15 AM, H.J. Lu wrote: > Tested on CET SDV using the CET kernel on cet branch at: > > https://github.com/yyu168/linux_cet/tree/cet > > OK for trunk and GCC 8 branch? > > Thanks. > > > H.J. > --- > The CET kernel has been changed to p

Re: [PATCH] combine: Allow combining two insns to two insns

2018-07-31 Thread H.J. Lu
On Wed, Jul 25, 2018 at 1:28 AM, Richard Biener wrote: > On Tue, Jul 24, 2018 at 7:18 PM Segher Boessenkool > wrote: >> >> This patch allows combine to combine two insns into two. This helps >> in many cases, by reducing instruction path length, and also allowing >> further combinations to happe

Re: [PATCH 01/11] Add __builtin_speculation_safe_value

2018-07-31 Thread H.J. Lu
On Mon, Jul 30, 2018 at 6:16 AM, Richard Biener wrote: > On Fri, 27 Jul 2018, Richard Earnshaw wrote: > >> >> This patch defines a new intrinsic function >> __builtin_speculation_safe_value. A generic default implementation is >> defined which will attempt to use the backend pattern >> "speculati

Re: [PATCH 10/11] x86 - add speculation_barrier pattern

2018-07-31 Thread H.J. Lu
On Sat, Jul 28, 2018 at 1:25 AM, Uros Bizjak wrote: > On Fri, Jul 27, 2018 at 11:37 AM, Richard Earnshaw > wrote: >> >> This patch adds a speculation barrier for x86, based on my >> understanding of the required mitigation for that CPU, which is to use >> an lfence instruction. >> >> This patch n

Re: [PR 83141] Prevent SRA from removing type changing assignment

2018-07-31 Thread H.J. Lu
On Tue, Dec 5, 2017 at 4:00 AM, Martin Jambor wrote: > On Tue, Dec 05 2017, Martin Jambor wrote: >> On Tue, Dec 05 2017, Martin Jambor wrote: >> Hi, >> >>> Hi, >>> >>> this is a followup to Richi's >>> https://gcc.gnu.org/ml/gcc-patches/2017-11/msg02396.html to fix PR >>> 83141. The basic idea is

Re: RFC/A: Add a targetm.vectorize.related_mode hook

2019-10-23 Thread H.J. Lu
On Wed, Oct 23, 2019 at 4:51 AM Richard Sandiford wrote: > > Richard Biener writes: > > On Wed, Oct 23, 2019 at 1:00 PM Richard Sandiford > > wrote: > >> > >> This patch is the first of a series that tries to remove two > >> assumptions: > >> > >> (1) that all vectors involved in vectorisation m

Re: RFC/A: Add a targetm.vectorize.related_mode hook

2019-10-24 Thread H.J. Lu
On Thu, Oct 24, 2019 at 12:56 AM Richard Sandiford wrote: > > "H.J. Lu" writes: > > On Wed, Oct 23, 2019 at 4:51 AM Richard Sandiford > > wrote: > >> > >> Richard Biener writes: > >> > On Wed, Oct 23, 2019 at 1:00 PM Richard Sandiford

Re: [PR47785] COLLECT_AS_OPTIONS

2019-10-29 Thread H.J. Lu
On Sun, Oct 27, 2019 at 6:33 PM Kugan Vivekanandarajah wrote: > > Hi Richard, > > Thanks for the review. > > On Wed, 23 Oct 2019 at 23:07, Richard Biener > wrote: > > > > On Mon, Oct 21, 2019 at 10:04 AM Kugan Vivekanandarajah > > wrote: > > > > > > Hi Richard, > > > > > > Thanks for the pointe

Re: [PR47785] COLLECT_AS_OPTIONS

2019-11-01 Thread H.J. Lu
On Thu, Oct 31, 2019 at 6:33 PM Kugan Vivekanandarajah wrote: > > On Wed, 30 Oct 2019 at 03:11, H.J. Lu wrote: > > > > On Sun, Oct 27, 2019 at 6:33 PM Kugan Vivekanandarajah > > wrote: > > > > > > Hi Richard, > > > > > > Thanks for t

Re: [PR47785] COLLECT_AS_OPTIONS

2019-11-04 Thread H.J. Lu
On Sun, Nov 3, 2019 at 6:45 PM Kugan Vivekanandarajah wrote: > > Thanks for the reviews. > > > On Sat, 2 Nov 2019 at 02:49, H.J. Lu wrote: > > > > On Thu, Oct 31, 2019 at 6:33 PM Kugan Vivekanandarajah > > wrote: > > > > > > On Wed, 30 Oct 20

Re: [PATCH] Set AVX128_OPTIMAL for all avx targets.

2019-11-12 Thread H.J. Lu
On Tue, Nov 12, 2019 at 2:48 AM Hongtao Liu wrote: > > On Tue, Nov 12, 2019 at 4:41 PM Richard Biener > wrote: > > > > On Tue, Nov 12, 2019 at 9:29 AM Hongtao Liu wrote: > > > > > > On Tue, Nov 12, 2019 at 4:19 PM Richard Biener > > > wrote: > > > > > > > > On Tue, Nov 12, 2019 at 8:36 AM Hongt

Re: [PR47785] COLLECT_AS_OPTIONS

2020-01-17 Thread H.J. Lu
kanandarajah > > > wrote: > > > > > > > > Hi, > > > > Thanks for the review. > > > > > > > > On Tue, 5 Nov 2019 at 03:57, H.J. Lu wrote: > > > > > > > > > > On Sun, Nov 3, 2019 at 6:45 PM Kugan Vivekananda

[PATCH] PR target/93319: x32: Add x32 support to -mtls-dialect=gnu2

2020-01-19 Thread H.J. Lu
To add x32 support to -mtls-dialect=gnu2, we need to replace DI with P in GNU2 TLS patterns. Since thread pointer is in ptr_mode, PLUS in GNU2 TLS address computation must be done in ptr_mode to support -maddress-mode=long. Also drop the "q" suffix from lea to support both "lea foo@TLSDESC(%rip),

Re: New repository location

2020-01-19 Thread H.J. Lu
On Sun, Jan 19, 2020 at 6:33 AM Bill Schmidt wrote: > > Question: Is the new gcc git repository at gcc.gnu.org/git/gcc.git > using the same location as the earlier git mirror did? I'm curious > whether our repository on pike is still syncing with the new master, or > whether we need to make some

Re: [PATCH] PR target/93319: x32: Add x32 support to -mtls-dialect=gnu2

2020-01-19 Thread H.J. Lu
On Sun, Jan 19, 2020 at 9:48 AM Uros Bizjak wrote: > > On Sun, Jan 19, 2020 at 6:43 PM Uros Bizjak wrote: > > > > On Sun, Jan 19, 2020 at 2:58 PM H.J. Lu wrote: > > > > > > To add x32 support to -mtls-dialect=gnu2, we need to replace DI with > > >

Re: [PATCH] PR target/93319: x32: Add x32 support to -mtls-dialect=gnu2

2020-01-19 Thread H.J. Lu
On Sun, Jan 19, 2020 at 12:01 PM Uros Bizjak wrote: > > On Sun, Jan 19, 2020 at 7:07 PM H.J. Lu wrote: > > > > On Sun, Jan 19, 2020 at 9:48 AM Uros Bizjak wrote: > > > > > > On Sun, Jan 19, 2020 at 6:43 PM Uros Bizjak wrote: > > > > >

Re: [PATCH] PR target/93319: x32: Add x32 support to -mtls-dialect=gnu2

2020-01-19 Thread H.J. Lu
On Sun, Jan 19, 2020 at 12:16 PM Uros Bizjak wrote: > > On Sun, Jan 19, 2020 at 9:07 PM H.J. Lu wrote: > > > > On Sun, Jan 19, 2020 at 12:01 PM Uros Bizjak wrote: > > > > > > On Sun, Jan 19, 2020 at 7:07 PM H.J. Lu wrote: > > > > > >

Re: [PATCH] Make target_clones resolver fn static.

2020-01-20 Thread H.J. Lu
On Mon, Jan 20, 2020 at 2:25 AM Richard Biener wrote: > > On Fri, Jan 17, 2020 at 10:25 AM Martin Liška wrote: > > > > Hi. > > > > The patch removes need to have a gnu_indirect_function global > > symbol. That aligns the code with what ppc64 target does. > > > > Patch can bootstrap on x86_64-linu

Re: [PATCH] PR target/93319: x32: Add x32 support to -mtls-dialect=gnu2

2020-01-20 Thread H.J. Lu
On Sun, Jan 19, 2020 at 11:53 PM Uros Bizjak wrote: > > On Sun, Jan 19, 2020 at 10:00 PM H.J. Lu wrote: > > > > On Sun, Jan 19, 2020 at 12:16 PM Uros Bizjak wrote: > > > > > > On Sun, Jan 19, 2020 at 9:07 PM H.J. Lu wrote: > > > > > >

Re: [PATCH] Make target_clones resolver fn static.

2020-01-20 Thread H.J. Lu
On Mon, Jan 20, 2020 at 5:36 AM Alexander Monakov wrote: > > > > On Mon, 20 Jan 2020, H.J. Lu wrote: > > We can do that only if function is static: > > > [ship asm] > > > > In this case, foo must be global. > > H.J., can you rephrase more clearly? You

Re: [PATCH] Make target_clones resolver fn static.

2020-01-20 Thread H.J. Lu
On Mon, Jan 20, 2020 at 6:16 AM Alexander Monakov wrote: > > > > On Mon, 20 Jan 2020, H.J. Lu wrote: > > For, > > > > --- > > __attribute__((target_clones("avx","default"))) > > int > > foo () > > {

Re: [PATCH] Make target_clones resolver fn static.

2020-01-20 Thread H.J. Lu
On Mon, Jan 20, 2020 at 6:41 AM Alexander Monakov wrote: > > > > On Mon, 20 Jan 2020, H.J. Lu wrote: > > > > Bare IFUNC's don't seem to have this restriction. Why do we want to > > > constrain target clones this way? > > > > > > > fo

Re: [PATCH] PR target/93319: x32: Add x32 support to -mtls-dialect=gnu2

2020-01-20 Thread H.J. Lu
On Mon, Jan 20, 2020 at 5:24 AM H.J. Lu wrote: > > On Sun, Jan 19, 2020 at 11:53 PM Uros Bizjak wrote: > > > > On Sun, Jan 19, 2020 at 10:00 PM H.J. Lu wrote: > > > > > > On Sun, Jan 19, 2020 at 12:16 PM Uros Bizjak wrote: > > > > >

Re: [PATCH] PR target/93319: x32: Add x32 support to -mtls-dialect=gnu2

2020-01-21 Thread H.J. Lu
On Tue, Jan 21, 2020 at 2:29 AM Uros Bizjak wrote: > > On Tue, Jan 21, 2020 at 9:47 AM Uros Bizjak wrote: > > > > On Mon, Jan 20, 2020 at 10:46 PM H.J. Lu wrote: > > > > > > > OK. Let's go with this version, but please investigate if we need to

[PATCH] i386: Don't use ix86_tune_ctrl_string in parse_mtune_ctrl_str

2020-01-27 Thread H.J. Lu
There are static void parse_mtune_ctrl_str (bool dump) { if (!ix86_tune_ctrl_string) return; parse_mtune_ctrl_str is only called from set_ix86_tune_features, which is only called from ix86_function_specific_restore and ix86_option_override_internal. parse_mtune_ctrl_str shouldn't use ix86_

[PATCH] i386: Disable TARGET_SSE_TYPELESS_STORES for TARGET_AVX

2020-01-27 Thread H.J. Lu
movaps/movups is one byte shorter than movdqa/movdqu. But it isn't the case for AVX nor AVX512. We should disable TARGET_SSE_TYPELESS_STORES for TARGET_AVX. gcc/ PR target/91461 * config/i386/i386.h (TARGET_SSE_TYPELESS_STORES): Disable for TARGET_AVX. * config/i

PING^5: [PATCH] i386: Properly encode xmm16-xmm31/ymm16-ymm31 for vector move

2020-01-27 Thread H.J. Lu
On Mon, Jul 8, 2019 at 8:19 AM H.J. Lu wrote: > > On Tue, Jun 18, 2019 at 8:59 AM H.J. Lu wrote: > > > > On Fri, May 31, 2019 at 10:38 AM H.J. Lu wrote: > > > > > > On Tue, May 21, 2019 at 2:43 PM H.J. Lu wrote: > > > > >

Re: [PATCH] i386: Disable TARGET_SSE_TYPELESS_STORES for TARGET_AVX

2020-01-27 Thread H.J. Lu
On Mon, Jan 27, 2020 at 12:26 PM Uros Bizjak wrote: > > On Mon, Jan 27, 2020 at 7:23 PM H.J. Lu wrote: > > > > movaps/movups is one byte shorter than movdqa/movdqu. But it isn't the > > case for AVX nor AVX512. We should disable TARGET_SSE_TYPELESS_STORES

Re: [PATCH] i386: Disable TARGET_SSE_TYPELESS_STORES for TARGET_AVX

2020-01-27 Thread H.J. Lu
On Mon, Jan 27, 2020 at 2:17 PM H.J. Lu wrote: > > On Mon, Jan 27, 2020 at 12:26 PM Uros Bizjak wrote: > > > > On Mon, Jan 27, 2020 at 7:23 PM H.J. Lu wrote: > > > > > > movaps/movups is one byte shorter than movdqa/movdqu. But it isn't the > >

Re: [PATCH] i386: Disable TARGET_SSE_TYPELESS_STORES for TARGET_AVX

2020-01-28 Thread H.J. Lu
On Mon, Jan 27, 2020 at 11:04 PM Uros Bizjak wrote: > > On Mon, Jan 27, 2020 at 11:17 PM H.J. Lu wrote: > > > > On Mon, Jan 27, 2020 at 12:26 PM Uros Bizjak wrote: > > > > > > On Mon, Jan 27, 2020 at 7:23 PM H.J. Lu wrote: > > > > > >

Re: [PATCH] i386: Disable TARGET_SSE_TYPELESS_STORES for TARGET_AVX

2020-01-28 Thread H.J. Lu
On Tue, Jan 28, 2020 at 6:45 AM Uros Bizjak wrote: > > On Tue, Jan 28, 2020 at 3:32 PM H.J. Lu wrote: > > > > On Mon, Jan 27, 2020 at 11:04 PM Uros Bizjak wrote: > > > > > > On Mon, Jan 27, 2020 at 11:17 PM H.J. Lu wrote: > > > > > >

[PATCH] i386: Prefer TARGET_AVX over TARGET_SSE_TYPELESS_STORES

2020-01-28 Thread H.J. Lu
On Tue, Jan 28, 2020 at 9:12 AM Uros Bizjak wrote: > > On Tue, Jan 28, 2020 at 4:34 PM H.J. Lu wrote: > > > > You could move > > > > > > (match_test "TARGET_AVX") > > > (const_string "TI") > > > > > > up to bypa

Re: [PATCH] i386: Prefer TARGET_AVX over TARGET_SSE_TYPELESS_STORES

2020-01-28 Thread H.J. Lu
On Tue, Jan 28, 2020 at 10:04 AM Uros Bizjak wrote: > > On Tue, Jan 28, 2020 at 6:51 PM H.J. Lu wrote: > > > > On Tue, Jan 28, 2020 at 9:12 AM Uros Bizjak wrote: > > > > > > On Tue, Jan 28, 2020 at 4:34 PM H.J. Lu wrote: > > > > >

Re: [PATCH] i386: Prefer TARGET_AVX over TARGET_SSE_TYPELESS_STORES

2020-01-28 Thread H.J. Lu
On Tue, Jan 28, 2020 at 10:58 AM Jakub Jelinek wrote: > > On Tue, Jan 28, 2020 at 10:20:36AM -0800, H.J. Lu wrote: > > From 66c534dedc7a9a632aa38c32e3f7c251b8f2c778 Mon Sep 17 00:00:00 2001 > > From: "H.J. Lu" > > Date: Mon, 27 Jan 2020 09:35:11 -0800 > >

[PATCH] i386: Define TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY

2020-02-03 Thread H.J. Lu
-- H.J. From 5363c0289e3525139939bb678deeda98d06b2556 Mon Sep 17 00:00:00 2001 From: "H.J. Lu" Date: Mon, 3 Feb 2020 10:22:57 -0800 Subject: [PATCH] i386: Define TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY Define TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY to make sure that the ENDBR ar

Re: [PATCH] i386: Define TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY

2020-02-03 Thread H.J. Lu
On Mon, Feb 3, 2020 at 10:35 AM H.J. Lu wrote: > > Define TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY to make sure that the > ENDBR are emitted before the patch area. When -mfentry -pg is also used > together, there should be no ENDBR before "call __fentry__". > >

Re: [PATCH] i386: Define TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY

2020-02-03 Thread H.J. Lu
On Mon, Feb 3, 2020 at 4:02 PM H.J. Lu wrote: > > On Mon, Feb 3, 2020 at 10:35 AM H.J. Lu wrote: > > > > Define TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY to make sure that the > > ENDBR are emitted before the patch area. When -mfentry -pg is also used > > toge

[PATCH] x86: Add UNSPECV_PATCHABLE_AREA

2020-02-04 Thread H.J. Lu
On Mon, Feb 03, 2020 at 06:10:49PM -0800, H.J. Lu wrote: > On Mon, Feb 3, 2020 at 4:02 PM H.J. Lu wrote: > > > > On Mon, Feb 3, 2020 at 10:35 AM H.J. Lu wrote: > > > > > > Define TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY to make sure that the > > > ENDB

[PATCH 0/3] Update -fpatchable-function-entry implementation

2020-02-05 Thread H.J. Lu
table, I can combine patch 1 and patch 3 into a single patch. H.J. Lu (3): x86: Add UNSPECV_PATCHABLE_AREA Add patch_area_size and patch_area_entry to cfun x86: Simplify UNSPECV_PATCHABLE_AREA generation gcc/config/i386/i386-features.c | 130 -- gcc/config/i386

[PATCH 1/3] x86: Add UNSPECV_PATCHABLE_AREA

2020-02-05 Thread H.J. Lu
Currently patchable area is at the wrong place. It is placed immediately after function label, before both .cfi_startproc and ENDBR. This patch adds UNSPECV_PATCHABLE_AREA for pseudo patchable area instruction and changes ENDBR insertion pass to also insert a dummy patchable area. TARGET_ASM_PRIN

[PATCH 2/3] Add patch_area_size and patch_area_entry to cfun

2020-02-05 Thread H.J. Lu
Currently patchable area is at the wrong place. It is placed immediately after function label and before .cfi_startproc. A backend should be able to add a pseudo patchable area instruction directly into RTL. This patch adds patch_area_size and patch_area_entry to cfun so that the patchable area

[PATCH 3/3] x86: Simplify UNSPECV_PATCHABLE_AREA generation

2020-02-05 Thread H.J. Lu
Since patch_area_size and patch_area_entry have been added to cfun, we can use them to directly insert the pseudo UNSPECV_PATCHABLE_AREA instruction. PR target/93492 * config/i386/i386-features.c (rest_of_insert_endbr_and_patchable_area): Change need_patchable_area

[PATCH] x86-64: Pass aggregates with only float/double in GPRs for MS_ABI

2020-02-05 Thread H.J. Lu
test. * gcc.target/i386/pr85667-7.c: Likewise. * gcc.target/i386/pr85667-8.c: Likewise. * gcc.target/i386/pr85667-9.c: Likewise. -- H.J. From e561fd8fcb46b8d8e40942c077e26ce120832747 Mon Sep 17 00:00:00 2001 From: "H.J. Lu" Date: Wed, 5 Feb 2020 09:49:56 -0800 Subject: [PATCH] x86-64: Pass

[PATCH] Add patch_area_size and patch_area_entry to crtl

2020-02-05 Thread H.J. Lu
On Wed, Feb 5, 2020 at 9:00 AM Richard Sandiford wrote: > > "H.J. Lu" writes: > > Currently patchable area is at the wrong place. > > Agreed :-) > > > It is placed immediately > > after function label and before .cfi_startproc. A backend should

Re: [PATCH] Add patch_area_size and patch_area_entry to crtl

2020-02-05 Thread H.J. Lu
On Wed, Feb 5, 2020 at 12:20 PM H.J. Lu wrote: > > On Wed, Feb 5, 2020 at 9:00 AM Richard Sandiford > wrote: > > > > "H.J. Lu" writes: > > > Currently patchable area is at the wrong place. > > > > Agreed :-) > > > > > It is plac

Re: [PATCH] Add patch_area_size and patch_area_entry to crtl

2020-02-05 Thread H.J. Lu
On Wed, Feb 5, 2020 at 2:37 PM Marek Polacek wrote: > > On Wed, Feb 05, 2020 at 02:24:48PM -0800, H.J. Lu wrote: > > On Wed, Feb 5, 2020 at 12:20 PM H.J. Lu wrote: > > > > > > On Wed, Feb 5, 2020 at 9:00 AM Richard Sandiford > > > wrote: > > >

Re: [PATCH] Add patch_area_size and patch_area_entry to crtl

2020-02-05 Thread H.J. Lu
On Wed, Feb 5, 2020 at 2:51 PM H.J. Lu wrote: > > On Wed, Feb 5, 2020 at 2:37 PM Marek Polacek wrote: > > > > On Wed, Feb 05, 2020 at 02:24:48PM -0800, H.J. Lu wrote: > > > On Wed, Feb 5, 2020 at 12:20 PM H.J. Lu wrote: > > > > > > >

[PATCH] Use the section flag 'o' for __patchable_function_entries

2020-02-06 Thread H.J. Lu
This commit in GNU binutils 2.35: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=b7d072167715829eed0622616f6ae0182900de3e added the section flag 'o' to .section directive: .section __patchable_function_entries,"awo",@progbits,foo which specifies the symbol name which the se

Re: [PATCH] x86-64: Pass aggregates with only float/double in GPRs for MS_ABI

2020-02-06 Thread H.J. Lu
On Wed, Feb 05, 2020 at 09:51:14PM +0100, Uros Bizjak wrote: > On Wed, Feb 5, 2020 at 6:59 PM H.J. Lu wrote: > > > > MS_ABI requires passing aggregates with only float/double in integer > > registers. Checked gcc outputs against Clang and fixed: > > > > FAIL:

PING^6: [PATCH] i386: Properly encode xmm16-xmm31/ymm16-ymm31 for vector move

2020-02-06 Thread H.J. Lu
On Mon, Jan 27, 2020 at 10:59 AM H.J. Lu wrote: > > On Mon, Jul 8, 2019 at 8:19 AM H.J. Lu wrote: > > > > On Tue, Jun 18, 2019 at 8:59 AM H.J. Lu wrote: > > > > > > On Fri, May 31, 2019 at 10:38 AM H.J. Lu wrote: > > > > >

Re: [PATCH] x86-64: Pass aggregates with only float/double in GPRs for MS_ABI

2020-02-07 Thread H.J. Lu
On Fri, Feb 7, 2020 at 2:14 AM JonY <10wa...@gmail.com> wrote: > > On 2/7/20 3:23 AM, H.J. Lu wrote: > > On Wed, Feb 05, 2020 at 09:51:14PM +0100, Uros Bizjak wrote: > >> On Wed, Feb 5, 2020 at 6:59 PM H.J. Lu wrote: > >>> > >>> MS_ABI requires p

[PATCH] i386: Properly pop restore token in signal frame

2020-02-08 Thread H.J. Lu
Linux CET kernel places a restore token on shadow stack for signal handler to enhance security. The restore token is 8 byte and aligned to 8 bytes. It is usually transparent to user programs since kernel will pop the restore token when signal handler returns. But when an exception is thrown from

[PATCH] i386: Skip ENDBR32 at nested function entry

2020-02-10 Thread H.J. Lu
Since nested function isn't only called directly, there is ENDBR32 at function entry and we need to skip it for direct jump in trampoline. Tested on Linux/x86-64 CET machine with and without -m32. gcc/ PR target/93656 * config/i386/i386.c (ix86_trampoline_init): Skip ENDBR32 at

Re: [PATCH] i386: Skip ENDBR32 at nested function entry

2020-02-10 Thread H.J. Lu
On Mon, Feb 10, 2020 at 11:40 AM Uros Bizjak wrote: > > On Mon, Feb 10, 2020 at 8:22 PM H.J. Lu wrote: > > > > Since nested function isn't only called directly, there is ENDBR32 at > > function entry and we need to skip it for direct jump in trampoline. > > Hm,

Re: [PATCH] i386: Skip ENDBR32 at nested function entry

2020-02-12 Thread H.J. Lu
On Mon, Feb 10, 2020 at 12:01 PM Uros Bizjak wrote: > > On Mon, Feb 10, 2020 at 8:53 PM H.J. Lu wrote: > > > > On Mon, Feb 10, 2020 at 11:40 AM Uros Bizjak wrote: > > > > > > On Mon, Feb 10, 2020 at 8:22 PM H.J. Lu wrote: > > > > > > >

[PATCH] i386: Also skip ENDBR32 at the target function entry

2020-02-13 Thread H.J. Lu
On Thu, Feb 13, 2020 at 09:29:32AM +0100, Uros Bizjak wrote: > On Wed, Feb 12, 2020 at 1:21 PM H.J. Lu wrote: > > > > On Mon, Feb 10, 2020 at 12:01 PM Uros Bizjak wrote: > > > > > > On Mon, Feb 10, 2020 at 8:53 PM H.J. Lu wrote: > > > > >

Re: [PATCH] i386: Also skip ENDBR32 at the target function entry

2020-02-13 Thread H.J. Lu
On Thu, Feb 13, 2020 at 01:28:43PM +0100, Uros Bizjak wrote: > On Thu, Feb 13, 2020 at 1:06 PM H.J. Lu wrote: > > > > On Thu, Feb 13, 2020 at 09:29:32AM +0100, Uros Bizjak wrote: > > > On Wed, Feb 12, 2020 at 1:21 PM H.J. Lu wrote: > > > > > > >

PING^7: [PATCH] i386: Properly encode xmm16-xmm31/ymm16-ymm31 for vector move

2020-02-13 Thread H.J. Lu
On Thu, Feb 6, 2020 at 8:17 PM H.J. Lu wrote: > > On Mon, Jan 27, 2020 at 10:59 AM H.J. Lu wrote: > > > > On Mon, Jul 8, 2019 at 8:19 AM H.J. Lu wrote: > > > > > > On Tue, Jun 18, 2019 at 8:59 AM H.J. Lu wrote: > > > > >

Re: [PATCH] i386: Also skip ENDBR32 at the target function entry

2020-02-13 Thread H.J. Lu
On Thu, Feb 13, 2020 at 5:10 AM Uros Bizjak wrote: > > On Thu, Feb 13, 2020 at 1:42 PM H.J. Lu wrote: > > > > On Thu, Feb 13, 2020 at 01:28:43PM +0100, Uros Bizjak wrote: > > > On Thu, Feb 13, 2020 at 1:06 PM H.J. Lu wrote: > > > > > > > > On

Re: Backports to 9.3

2020-02-14 Thread H.J. Lu
On Thu, Feb 13, 2020 at 2:46 PM Jakub Jelinek wrote: > > Hi! > > I've backported following 15 commits from trunk to 9.3 branch, > bootstrapped/regtested on x86_64-linux and i686-linux, committed. > Hi Jakub, Are you preparing for GCC 9.3? I'd like to include this in GCC 9.3: https://gcc.gnu.org

Re: Backports to 9.3

2020-02-14 Thread H.J. Lu
On Fri, Feb 14, 2020 at 7:51 AM Jakub Jelinek wrote: > > On Fri, Feb 14, 2020 at 07:45:43AM -0800, H.J. Lu wrote: > > Are you preparing for GCC 9.3? I'd like to include this in GCC 9.3: > > > > https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=1d69147af203d4dcd2270429f

[PATCH 02/10] i386: Use ix86_output_ssemov for XImode TYPE_SSEMOV

2020-02-15 Thread H.J. Lu
PR target/89229 * config/i386/i386.md (*movxi_internal_avx512f): Call ix86_output_ssemov for TYPE_SSEMOV. --- gcc/config/i386/i386.md | 6 +- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index f14683cd14f

[PATCH 06/10] i386: Use ix86_output_ssemov for SImode TYPE_SSEMOV

2020-02-15 Thread H.J. Lu
There is no need to set mode attribute to XImode since ix86_output_ssemov can properly encode xmm16-xmm31 registers with and without AVX512VL. gcc/ PR target/89229 * config/i386/i386.md (*movsi_internal): Call ix86_output_ssemov for TYPE_SSEMOV. Remove ext_sse_reg_operand

[PATCH 03/10] i386: Use ix86_output_ssemov for OImode TYPE_SSEMOV

2020-02-15 Thread H.J. Lu
There is no need to set mode attribute to XImode since ix86_output_ssemov can properly encode ymm16-ymm31 registers with and without AVX512VL. PR target/89229 * config/i386/i386.md (*movoi_internal_avx): Call ix86_output_ssemov for TYPE_SSEMOV. Remove ext_sse_reg_operand

[PATCH 00/10] i386: Properly encode xmm16-xmm31/ymm16-ymm31 for vector move

2020-02-15 Thread H.J. Lu
If xmm16-xmm31/ymm16-ymm31 registers are used: a. With AVX512VL, AVX512VL vector moves will be generated. b. Without AVX512VL, xmm16-xmm31/ymm16-ymm31 register to register move will be done with zmm register move. Tested on AVX2 and AVX512 with and without --with-arch=native. H.J.
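
The two cases described in the cover letter above can be sketched as a small decision function. This is only an illustration under stated assumptions: `move_kind`, `is_ext_reg` and `pick_reg_move` are hypothetical names, GCC's `ix86_output_ssemov` is organized differently, and the `MOVE_VEX` branch for registers 0-15 is an assumption that the cover letter does not spell out.

```c
#include <stdbool.h>

/* Hypothetical names for this sketch; not taken from GCC.  */
enum move_kind
{
  MOVE_VEX,       /* Registers 0-15 only: ordinary AVX move (assumed).  */
  MOVE_EVEX_VL,   /* AVX512VL move at the operand's own width.  */
  MOVE_EVEX_ZMM   /* Register-to-register move done as a zmm move.  */
};

/* Registers numbered 16-31 are the ones the series is about.  */
static bool
is_ext_reg (unsigned regno)
{
  return regno >= 16 && regno <= 31;
}

/* Choose how to move between two vector registers.  */
static enum move_kind
pick_reg_move (unsigned dest, unsigned src, bool have_avx512vl)
{
  if (!is_ext_reg (dest) && !is_ext_reg (src))
    return MOVE_VEX;

  /* Case (a): AVX512VL available.  Case (b): fall back to zmm.  */
  return have_avx512vl ? MOVE_EVEX_VL : MOVE_EVEX_ZMM;
}
```

For example, `pick_reg_move (17, 2, false)` selects the zmm fallback, matching case (b).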

[PATCH 04/10] i386: Use ix86_output_ssemov for TImode TYPE_SSEMOV

2020-02-15 Thread H.J. Lu
There is no need to set mode attribute to XImode since ix86_output_ssemov can properly encode xmm16-xmm31 registers with and without AVX512VL. gcc/ PR target/89229 * config/i386/i386.md (*movti_internal): Call ix86_output_ssemov for TYPE_SSEMOV. Remove ext_sse_reg_operand

[PATCH 05/10] i386: Use ix86_output_ssemov for DImode TYPE_SSEMOV

2020-02-15 Thread H.J. Lu
There is no need to set mode attribute to XImode since ix86_output_ssemov can properly encode xmm16-xmm31 registers with and without AVX512VL. gcc/ PR target/89229 * config/i386/i386.md (*movdi_internal): Call ix86_output_ssemov for TYPE_SSEMOV. Remove ext_sse_reg_operand

[PATCH 01/10] i386: Properly encode vector registers in vector move

2020-02-15 Thread H.J. Lu
On x86, when AVX and AVX512 are enabled, vector move instructions can be encoded with either 2-byte/3-byte VEX (AVX) or 4-byte EVEX (AVX512): 0: c5 f9 6f d1 vmovdqa %xmm1,%xmm2 4: 62 f1 fd 08 6f d1 vmovdqa64 %xmm1,%xmm2 We prefer VEX encoding over EVEX since VEX is sho
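
The prefix sizes quoted above can be read off the first byte of each encoding. The following sketch assumes 64-bit mode, where the bytes `0xc5`, `0xc4` and `0x62` are not shared with legacy instructions; `vex_evex_prefix_len` and the array names are hypothetical, and the byte sequences are copied from the disassembly in the entry.

```c
/* The two encodings shown in the entry above.  */
static const unsigned char vmovdqa_vex[] = { 0xc5, 0xf9, 0x6f, 0xd1 };
static const unsigned char vmovdqa64_evex[]
  = { 0x62, 0xf1, 0xfd, 0x08, 0x6f, 0xd1 };

/* Return the length in bytes of the VEX or EVEX prefix that starts
   with FIRST_BYTE, or 0 if the byte starts neither.  Assumes 64-bit
   mode.  */
static unsigned
vex_evex_prefix_len (unsigned char first_byte)
{
  switch (first_byte)
    {
    case 0xc5: return 2;   /* 2-byte VEX.  */
    case 0xc4: return 3;   /* 3-byte VEX.  */
    case 0x62: return 4;   /* EVEX.  */
    default:   return 0;
    }
}
```

The EVEX form of this move is two bytes longer than the 2-byte VEX form, which is the size difference the entry refers to.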

[PATCH 08/10] i386: Use ix86_output_ssemov for DFmode TYPE_SSEMOV

2020-02-15 Thread H.J. Lu
There is no need to set mode attribute to XImode nor V8DFmode since ix86_output_ssemov can properly encode xmm16-xmm31 registers with and without AVX512VL. gcc/ PR target/89229 * config/i386/i386.md (*movdf_internal): Call ix86_output_ssemov for TYPE_SSEMOV. Remove TARGET

[PATCH 09/10] i386: Use ix86_output_ssemov for SFmode TYPE_SSEMOV

2020-02-15 Thread H.J. Lu
There is no need to set mode attribute to V16SFmode since ix86_output_ssemov can properly encode xmm16-xmm31 registers with and without AVX512VL. gcc/ PR target/89229 * config/i386/i386.md (*movdf_internal): Call ix86_output_ssemov for TYPE_SSEMOV. Remove TARGET_PREFER_AV

[PATCH 10/10] i386: Use ix86_output_ssemov for MMX TYPE_SSEMOV

2020-02-15 Thread H.J. Lu
There is no need to set mode attribute to XImode since ix86_output_ssemov can properly encode xmm16-xmm31 registers with and without AVX512VL. Remove ext_sse_reg_operand since it is no longer needed. PR target/89229 * config/i386/mmx.md (MMXMODE:*mov_internal): Call ix86_o

[PATCH 07/10] i386: Use ix86_output_ssemov for TFmode TYPE_SSEMOV

2020-02-15 Thread H.J. Lu
gcc/ PR target/89229 * config/i386/i386.md (*movtf_internal): Call ix86_output_ssemov for TYPE_SSEMOV. gcc/testsuite/ PR target/89229 * gcc.target/i386/pr89229-5a.c: New test. * gcc.target/i386/pr89229-5b.c: Likewise. * gcc.target/i386/pr89

[PATCH] i386: Use add for a = a + b and a = b + a when possible

2019-12-06 Thread H.J. Lu
From: "H.J. Lu" Date: Tue, 3 Dec 2019 15:27:51 -0800 Subject: [PATCH] i386: Use add for a = a + b and a = b + a when possible Since except for Bonnell, 01 fb add %edi,%ebx is faster and shorter than 8d 1c 1f lea (%rdi,%rbx,1),%ebx we should use add

Re: [C++ coroutines] Initial implementation pushed to master.

2024-03-05 Thread H.J. Lu
On Sat, Jan 18, 2020 at 4:54 AM Iain Sandoe wrote: > > Hi, > > Thanks to: > >* the reviewers, the code was definitely improved by your reviews. > >* those folks who tested the branch and/or compiler explorer > instance and reported problems with reproducers. > > * WG21 colleagues, e

[PATCH v2] tree-profile: Don't instrument an IFUNC resolver nor its callees

2024-03-05 Thread H.J. Lu
We can't instrument an IFUNC resolver nor its callees as it may require TLS which hasn't been set up yet when the dynamic linker is resolving IFUNC symbols. Add an IFUNC resolver caller marker to cgraph_node and set it if the function is called by an IFUNC resolver. Update tree_profiling to skip

Re: [PATCH] tree-profile: Don't instrument an IFUNC resolver nor its callees

2024-03-05 Thread H.J. Lu
On Thu, Feb 29, 2024 at 7:11 AM H.J. Lu wrote: > > On Thu, Feb 29, 2024 at 7:06 AM Jan Hubicka wrote: > > > > > > I am worried about scenario where ifunc selector calls function foo > > > > defined locally and foo is also used from other

Re: [C++ coroutines] Initial implementation pushed to master.

2024-03-06 Thread H.J. Lu
On Wed, Mar 6, 2024 at 1:03 AM Iain Sandoe wrote: > > > > > On 5 Mar 2024, at 17:31, H.J. Lu wrote: > > > > On Sat, Jan 18, 2020 at 4:54 AM Iain Sandoe wrote: > >> > > >> 2020-01-18 Iain Sandoe > >> > >>* Ma

Re: libbacktrace patch committed: Don't assume compressed section aligned

2024-03-08 Thread H.J. Lu
On Fri, Mar 8, 2024 at 2:48 PM Fangrui Song wrote: > > On ELF64, it looks like BFD uses 8-byte alignment for compressed > `.debug_*` sections while gold/lld/mold use 1-byte alignment. I do not > know how the Solaris linker sets the alignment. > > The specification's wording makes me confused wheth

PING: [PATCH v2] tree-profile: Don't instrument an IFUNC resolver nor its callees

2024-04-02 Thread H.J. Lu
On Tue, Mar 5, 2024 at 1:45 PM H.J. Lu wrote: > > We can't instrument an IFUNC resolver nor its callees as it may require > TLS which hasn't been set up yet when the dynamic linker is resolving > IFUNC symbols. > > Add an IFUNC resolver caller marker to cgraph_node and

Re: PING: [PATCH v2] tree-profile: Don't instrument an IFUNC resolver nor its callees

2024-04-02 Thread H.J. Lu
On Tue, Apr 2, 2024 at 7:50 AM Jan Hubicka wrote: > > > On Tue, Mar 5, 2024 at 1:45 PM H.J. Lu wrote: > > > > > > We can't instrument an IFUNC resolver nor its callees as it may require > > > TLS which hasn't been set up yet when the d
