On Thu, Sep 28, 2023 at 11:23 AM ZiNgA BuRgA wrote:
>
> That sounds about right. The code I had in mind would perhaps look like:
>
>
> #if defined(__AVX512BW__) && defined(__AVX512VL__)
> #if defined(__EVEX256__) && !defined(__EVEX512__)
> // compiled code is AVX10.1/256 and AVX512
> (apx_egpr): Likewise.
> (apx_push2pop2): Likewise.
> (apx_ndd): Likewise.
> (apx_all): Likewise.
> * doc/invoke.texi: Document mapxf.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/apx-1.c: New test.
>
> Co-aut
On Mon, Oct 9, 2023 at 10:05 AM Hongyu Wang wrote:
>
> For vec_concatv2di, m constraint in alternative 0 and 1 could result in
> egpr allocated on operand 2 under -mapxf. Should use jm instead.
>
> Bootstrapped/regtested on x86-64-linux-gnu.
>
> Ok for trunk?
Ok.
>
> gcc/ChangeLog:
>
> * c
On Tue, Oct 10, 2023 at 2:51 PM Hongyu Wang wrote:
>
> From: "Mo, Zewei"
>
> Hi,
>
> Intel APX PUSH2POP2 feature has been released in [1].
>
> This feature requires the stack to be 16-byte aligned; therefore, in
> prologue/epilogue, a standalone push/pop will be emitted before any
> push2/pop2 if th
On Thu, Jul 6, 2023 at 1:53 PM Uros Bizjak via Gcc-patches
wrote:
>
> On Thu, Jul 6, 2023 at 3:14 AM liuhongt wrote:
> >
> > For testcase
> >
> > void __cond_swap(double* __x, double* __y) {
> > bool __r = (*__x < *__y);
> > auto __tmp = __r ? *__x : *__y;
> > *__y = __r ? *__y : *__x;
> >
On Mon, Oct 16, 2023 at 2:25 PM Haochen Jiang wrote:
>
> Hi all,
>
> The patches aim to add new cpu archs Clear Water Forest and
> Panther Lake. Here comes the documentation:
>
> https://cdrdv2.intel.com/v1/dl/getContent/671368
>
> Also in the patches, I refactored how we detect cpu according to f
On Wed, Oct 18, 2023 at 4:33 PM liuhongt wrote:
>
Cut from subject...
There's a loop in vect_peel_nonlinear_iv_init to get init_expr * pow
(step_expr, skip_niters). When skip_niters is too big, compile time
hogs. To avoid that, optimize init_expr * pow (step_expr, skip_niters)
to init_expr << (exa
On Wed, Oct 18, 2023 at 4:10 PM Haochen Jiang wrote:
>
> Hi all,
>
> I just found that since ISAs enabled on Sierra Forest changed, clients since
> Arrow Lake will wrongly enable ENQCMD according to the current code.
>
> To avoid messing up again in the future, I changed the dependency on how ISAs
On Mon, Oct 23, 2023 at 8:35 PM Richard Biener
wrote:
>
> On Mon, Oct 23, 2023 at 10:48 AM liuhongt wrote:
> >
> > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > Ready push to trunk.
>
> vcond and vcondeq shouldn't be necessary if there's
> vcond_mask and vcmp support which is the
On Tue, Oct 24, 2023 at 10:53 AM Hongtao Liu wrote:
>
> On Mon, Oct 23, 2023 at 8:35 PM Richard Biener
> wrote:
> >
> > On Mon, Oct 23, 2023 at 10:48 AM liuhongt wrote:
> > >
> > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > >
On Tue, Oct 24, 2023 at 1:23 PM Hongtao Liu wrote:
>
> On Tue, Oct 24, 2023 at 10:53 AM Hongtao Liu wrote:
> >
> > On Mon, Oct 23, 2023 at 8:35 PM Richard Biener
> > wrote:
> > >
> > > On Mon, Oct 23, 2023 at 10:48 AM liuhongt wrote:
> > >
On Tue, Oct 24, 2023 at 6:10 PM Richard Sandiford
wrote:
>
> The files changed in this patch had tests for masked and unmasked
> popcnt. However, the mask inputs to the masked forms were undefined,
> and would be set to zero by init_regs. Any combine-like pass that
> ran after init_regs could th
On Fri, Oct 27, 2023 at 2:49 PM Richard Biener
wrote:
>
>
>
> > Am 27.10.2023 um 07:50 schrieb liuhongt :
> >
> > When 2 vectors are equal, kmask is allones and kortest will set CF,
> > else CF will be cleared.
> >
> > So CF bit can be used to check for the result of the comparison.
> >
> > Boots
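The CF behaviour described above can be sketched in portable C. This is only
a software model of the flag semantics, not the patch: the function names,
the 16-lane width and the int32_t element type are assumptions made for
illustration.

```c
#include <stdint.h>

#define LANES 16 /* assumed lane count for this sketch */

/* Model of a per-lane equality compare that produces a 16-bit mask:
   bit i is set when lane i of a equals lane i of b. */
uint16_t cmpeq_mask(const int32_t *a, const int32_t *b)
{
    uint16_t k = 0;
    for (int i = 0; i < LANES; i++)
        if (a[i] == b[i])
            k |= (uint16_t)(1u << i);
    return k;
}

/* Model of the carry flag after a 16-bit kortest:
   CF is set exactly when the OR of both masks is all ones. */
int kortest_cf(uint16_t k1, uint16_t k2)
{
    return (uint16_t)(k1 | k2) == 0xFFFF;
}

/* Two vectors are equal when the compare mask is all ones,
   which is exactly the case in which kortest k, k sets CF. */
int vectors_equal(const int32_t *a, const int32_t *b)
{
    uint16_t k = cmpeq_mask(a, b);
    return kortest_cf(k, k);
}
```

With this model a single differing lane clears one mask bit, so CF stays
clear and the comparison reports "not equal".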
On Fri, Oct 27, 2023 at 3:21 PM Hongtao Liu wrote:
>
> On Fri, Oct 27, 2023 at 2:49 PM Richard Biener
> wrote:
> >
> >
> >
> > > Am 27.10.2023 um 07:50 schrieb liuhongt :
> > >
> > > When 2 vectors are equal, kmask is allones
On Mon, Oct 30, 2023 at 3:47 PM Haochen Jiang wrote:
>
> Hi all,
>
> This patch fixes two obvious bugs in the current evex512 implementation.
>
> Also, I moved the AVX512CD+AVX512VL part out of AVX512VL to avoid
> accidentally missing avx512cd handling in the future.
>
> Ok for trunk?
Ok.
>
> BRs,
> Haoche
On Tue, Oct 31, 2023 at 2:39 PM Haochen Jiang wrote:
>
> Hi all,
>
> These four patches fix the no-evex512 function attribute. The details
> of the issue are described here:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111889
>
> My proposal for this problem is to also push "no-evex512" w
On Fri, Nov 3, 2023 at 6:34 PM Uros Bizjak wrote:
>
> The patch generalizes address register class handling to allow multiple
> address register classes. For APX EGPR targets, some instructions can't be
> encoded with REX2 prefix, so it is necessary to limit address register
> class to avoid REX2
On Mon, Nov 6, 2023 at 7:10 PM Jan Beulich wrote:
>
> On 25.06.2023 08:41, Hongtao Liu wrote:
> > On Sun, Jun 25, 2023 at 2:35 PM Hongtao Liu wrote:
> >>
> >> On Sun, Jun 25, 2023 at 2:25 PM Jan Beulich wrote:
> >>>
> >>> On 25.06.2023 07:1
Hi uros:
This patch fixes false dependence of scalar operations
vrcp/vsqrt/vrsqrt/vrndscale.
Bootstrap ok, regression test on i386/x86 ok.
It does something like this:
-
For scalar instructions with both xmm operands:
op %xmmN,%xmmQ,%xmmQ > op %xmmN, %xmmN, %xmmQ
for scalar instruc
Update patch:
Add m constraint to define_insn (sse_1_round):
Change constraint x to xm
since vround supports a memory operand.
* (*sse4_1_round): Ditto.
Bootstrap and regression test ok.
On Wed, Oct 23, 2019 at 9:56 AM Hongtao Liu wrote:
>
> Hi uros:
> This patch fi
On Fri, Oct 25, 2019 at 2:39 AM Uros Bizjak wrote:
>
> On Wed, Oct 23, 2019 at 7:48 AM Hongtao Liu wrote:
> >
> > Update patch:
> > Add m constraint to define_insn (sse_1_round > *sse_1_round > when under sse4 but not avx512f.
>
> It looks to me that the origi
On Fri, Oct 25, 2019 at 1:23 PM Hongtao Liu wrote:
>
> On Fri, Oct 25, 2019 at 2:39 AM Uros Bizjak wrote:
> >
> > On Wed, Oct 23, 2019 at 7:48 AM Hongtao Liu wrote:
> > >
> > > Update patch:
> > > Add m constraint to define_insn (sse_1_round > &g
Update patch.
On Fri, Oct 25, 2019 at 4:01 PM Uros Bizjak wrote:
>
> On Fri, Oct 25, 2019 at 7:55 AM Hongtao Liu wrote:
> >
> > On Fri, Oct 25, 2019 at 1:23 PM Hongtao Liu wrote:
> > >
> > > On Fri, Oct 25, 2019 at 2:39 AM Uros Bizjak wrote:
> > >
> Looking into sse.md, there are a lot of inconsistencies in existing *vm
> patterns w.r.t. operand constraints. Unfortunately, these were copied
> into proposed patterns. One example is existing
>
> (define_insn "_vmsqrt2"
> [(set (match_operand:VF_128 0 "register_operand" "=x,v")
> (vec_merg
> BTW: Please also note that there is no need to use or operand
> mode override in scalar insn templates for intel asm dialect when
> operand already has a scalar mode.
https://gcc.gnu.org/ml/gcc-patches/2019-10/msg01868.html
This patch is to remove redundant when operand already has a scalar mo
Hi uros:
This patch fixes an inefficient vector constructor.
Currently in ix86_expand_vector_init_concat, vectors are initialized
2 elements at a time, which can miss optimization opportunities like
pr92295.
Bootstrap and i386 regression test is ok.
Ok for trunk?
Changelog
gcc/
Hi Jakub:
Could you help review this patch?
PS: Since this patch is related to vectors (avx512f), and Uros
mentioned before that he has no intention to maintain avx512f.
On Fri, Nov 1, 2019 at 9:12 AM Hongtao Liu wrote:
>
> Hi uros:
> This patch is about to fix inefficie
Ping!
On Sat, Nov 2, 2019 at 9:38 PM Hongtao Liu wrote:
>
> Hi Jakub:
> Could you help reviewing this patch.
>
> PS: Since this patch is related to vectors(avx512f), and Uros
> mentioned before that he has no intention to maintain avx512f.
>
> On Fri, Nov 1, 2019 at 9:
Hi:
This patch sets X86_TUNE_AVX128_OPTIMAL as the default for
all AVX targets because we found there's still a performance gap between
128-bit auto-vectorization and 256-bit auto-vectorization even with
the epilogue vectorized.
The performance influence of setting avx128_optimal as default on
SP
On Tue, Nov 12, 2019 at 4:19 PM Richard Biener
wrote:
>
> On Tue, Nov 12, 2019 at 8:36 AM Hongtao Liu wrote:
> >
> > Hi:
> > This patch is about to set X86_TUNE_AVX128_OPTIMAL as default for
> > all AVX target because we found there's still perfo
On Tue, Nov 12, 2019 at 4:29 PM Richard Biener
wrote:
>
> On Tue, Nov 12, 2019 at 9:19 AM Richard Biener
> wrote:
> >
> > On Tue, Nov 12, 2019 at 8:36 AM Hongtao Liu wrote:
> > >
> > > Hi:
> > > This patch is about to set X86_TUNE_AVX128_OPTIMA
Hi:
As mentioned in https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00832.html
> So yes, it's poorly named. A preparatory patch to clean this up
> (and maybe split it into TARGET_AVX256_SPLIT_REGS and TARGET_AVX128_OPTIMAL)
> would be nice.
Bootstrap and regression test for i386 backend is ok.
On Tue, Nov 12, 2019 at 4:41 PM Richard Biener
wrote:
>
> On Tue, Nov 12, 2019 at 9:29 AM Hongtao Liu wrote:
> >
> > On Tue, Nov 12, 2019 at 4:19 PM Richard Biener
> > wrote:
> > >
> > > On Tue, Nov 12, 2019 at 8:36 AM Hongtao Liu wrote:
> > >
Hi
As mentioned in PR93724, several intrinsic macros lack a closing
parenthesis. These macros are only used with the -O0 option, and the
unit tests currently use -O2, so they are not covered.
Bootstrap ok, regression tests on i386/x86_64 are ok.
Ok for trunk?
Changelog
gcc/
* config/i386/avx512vbmi2in
On Thu, Feb 13, 2020 at 5:12 PM Uros Bizjak wrote:
>
> On Thu, Feb 13, 2020 at 9:53 AM Jakub Jelinek wrote:
> >
> > On Thu, Feb 13, 2020 at 09:39:05AM +0100, Uros Bizjak wrote:
> > > > Changelog
> > > > gcc/
> > > >* config/i386/avx512vbmi2intrin.h
> > > >(_mm512_[,mask_,maskz_]sh
On Thu, Feb 13, 2020 at 5:31 PM Hongtao Liu wrote:
>
> On Thu, Feb 13, 2020 at 5:12 PM Uros Bizjak wrote:
> >
> > On Thu, Feb 13, 2020 at 9:53 AM Jakub Jelinek wrote:
> > >
> > > On Thu, Feb 13, 2020 at 09:39:05AM +0100, Uros Bizjak wrot
Done.
On Fri, Feb 14, 2020 at 7:16 PM Uros Bizjak wrote:
>
> On Fri, Feb 14, 2020 at 8:06 AM Uros Bizjak wrote:
> >
> > On Fri, Feb 14, 2020 at 7:03 AM Hongtao Liu wrote:
> > >
> > > On Thu, Feb 13, 2020 at 5:31 PM Hongtao Liu wrote:
> > > >
>
On Wed, Dec 4, 2019 at 4:22 PM Jakub Jelinek wrote:
>
> On Wed, Dec 04, 2019 at 10:07:05AM +0800, Hongtao Liu wrote:
> > Changelog
> > gcc/
> > PR target/92686
> > * config/i386/sse.md
> > (*_cmp3,
> > *_cmp3,
> > *_uc
On Thu, Dec 5, 2019 at 4:03 PM Jakub Jelinek wrote:
>
> On Thu, Dec 05, 2019 at 09:56:46AM +0800, Hongtao Liu wrote:
> > --- a/gcc/config/i386/i386-expand.c
> > +++ b/gcc/config/i386/i386-expand.c
> > + /* Using vector move with mask register. */
> > +
Hi uros:
This patch renames OPTION_MASK_ISA_$target_[SET,UNSET, ]
to OPTION_MASK_ISA2_$target_[SET,UNSET, ] for those targets setting
x_ix86_isa_flags2.
The target list is as below:
-
static struct ix86_target_opts isa2_opts[] =
{
  { "-mcx16", OPTION_MASK_ISA2_CX
Hi jakub:
This patch is to enable integer mask cmp/cmov under AVX512F even
with TARGET_XOP .
Bootstrap and regression test on i386/x86_64 backend is ok.
Changelog:
PR target/92865
* gcc/config/i386/i386-expand.c (ix86_valid_mask_cmp_mode): Enable
integer mask cmov when available ev
On Tue, Dec 10, 2019 at 4:11 PM Jakub Jelinek wrote:
>
> On Tue, Dec 10, 2019 at 01:47:50PM +0800, Hongtao Liu wrote:
> > This patch is to enable integer mask cmp/cmov under AVX512F even
> > with TARGET_XOP .
> > Bootstrap and regression test on i386/x86_64 backend
On Wed, Dec 11, 2019 at 3:54 PM Jakub Jelinek wrote:
>
> On Wed, Dec 11, 2019 at 09:55:24AM +0800, Hongtao Liu wrote:
> > Changelog
> > gcc/
> > PR target/92865
> > * config/i386/i386-expand.c (ix86_valid_mask_cmp_mode): Enable
> > integer mask cmov
Hi:
This patch adds a tune option for integer mask cmov. For
targets that have both integer mask registers and sse mask registers,
this tune indicates that the integer one should be used. Currently it's
on by default for m_CORE_AVX512.
Bootstrap is ok, regression test on i386/x86_64 backends is ok.
ok for
Hi:
This patch is to simplify A * C + (-D) -> (A - D/C) * C when C is a
power of 2 and D mod C == 0.
bootstrap and make check is ok.
changelog
gcc/
* gcc/match.pd (A * C + (-D) = (A - D/C) * C. when C is a
power of 2 and D mod C == 0): Add new simplification.
gcc/testsuite
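The rewrite above can be checked with a small C sketch. This is not the
match.pd pattern itself; the function names are hypothetical, and unsigned
32-bit arithmetic is assumed so that wraparound is well defined.

```c
#include <stdint.h>

/* Original form: A * C + (-D), computed modulo 2^32. */
uint32_t mul_then_sub(uint32_t a, uint32_t c, uint32_t d)
{
    return a * c + (0u - d);
}

/* Rewritten form: (A - D/C) * C. Only equivalent to the form above
   when D is an exact multiple of C. */
uint32_t sub_then_mul(uint32_t a, uint32_t c, uint32_t d)
{
    return (a - d / c) * c;
}

/* Precondition named in the patch description:
   C is a power of 2 and D mod C == 0. */
int rewrite_applies(uint32_t c, uint32_t d)
{
    return c != 0 && (c & (c - 1)) == 0 && d % c == 0;
}
```

For A = 5, C = 8, D = 24 both forms give 16. With D = 20 the precondition
fails and the two forms differ (20 versus 24), which is why the D mod C == 0
check is needed.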
On Wed, Dec 18, 2019 at 10:50 AM Andrew Pinski wrote:
>
> On Tue, Dec 17, 2019 at 6:33 PM Hongtao Liu wrote:
> >
> > Hi:
> > This patch is to simplify A * C + (-D) -> (A - D/C) * C when C is a
> > power of 2 and D mod C == 0.
> > bootstrap and make ch
On Wed, Dec 18, 2019 at 4:26 PM Segher Boessenkool
wrote:
>
> On Wed, Dec 18, 2019 at 10:37:11AM +0800, Hongtao Liu wrote:
> > Hi:
> > This patch is to simplify A * C + (-D) -> (A - D/C) * C when C is a
> > power of 2 and D mod C == 0.
> > bootstrap and make c
On Thu, Feb 29, 2024 at 2:20 PM Hongtao Liu wrote:
>
> On Wed, Feb 28, 2024 at 4:54 PM Jakub Jelinek wrote:
> >
> > Hi!
> >
> > Adding Hongtao and Honza into the loop as the ones who acked the original
> > patch.
> >
> > The no_callee_saved_regist
On Tue, Mar 12, 2024 at 8:00 PM liuhongt wrote:
>
> If alignb > ASAN_RED_ZONE_SIZE and offset[0] is not a multiple of
> alignb, (base_align_bias - base_offset) may not be aligned to alignb,
> causing a segmentation fault.
>
> Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
> Ok for trunk and backpo
On Thu, Mar 14, 2024 at 3:22 PM Uros Bizjak wrote:
>
> On Thu, Mar 14, 2024 at 2:33 AM liuhongt wrote:
> >
> > When we split
> > (insn 37 36 38 10 (set (reg:DI 104 [ _18 ])
> > (mem:DI (reg/f:SI 98 [ CallNative_nclosure.0_1 ]) [6 MEM[(struct
> > SQRefCounted *)CallNative_nclosure.0_1]._u
On Thu, Mar 14, 2024 at 10:46 PM Uros Bizjak wrote:
>
> On Thu, Mar 14, 2024 at 8:42 AM Uros Bizjak wrote:
> >
> > On Thu, Mar 14, 2024 at 8:32 AM Hongtao Liu wrote:
> > >
> > > On Thu, Mar 14, 2024 at 3:22 PM Uros Bizjak wrote:
> > > >
> >
On Thu, Mar 14, 2024 at 11:42 PM Andrew Stubbs wrote:
>
> Don't enable excess lanes when inverting vector bit-masks smaller than the
> integer mode. This is yet another case of wrong-code due to mishandling
> of oversized bitmasks.
>
> This issue shows up in vect/tsvc/vect-tsvc-s278.c and
> vect/
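The excess-lane problem described above can be illustrated with a short C
sketch. It is not the patch: the function name is hypothetical, and a mask
stored in the low bits of a 64-bit integer is an assumption for illustration.

```c
#include <stdint.h>

/* Invert a vector bit-mask that has nlanes valid lanes stored in the low
   bits of a 64-bit integer. A plain ~mask would also set the unused high
   bits (the "excess lanes"), so the result is clipped to the valid lanes. */
uint64_t invert_lane_mask(uint64_t mask, unsigned nlanes)
{
    uint64_t valid = (nlanes >= 64) ? ~UINT64_C(0)
                                    : ((UINT64_C(1) << nlanes) - 1);
    return ~mask & valid;
}
```

For a 4-lane mask 0x5 this gives 0xA, whereas an unclipped ~0x5 would also
enable lanes 4 to 63.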
On Mon, Mar 18, 2024 at 6:59 PM Uros Bizjak wrote:
>
> On Mon, Mar 18, 2024 at 11:52 AM liuhongt wrote:
> >
> > Commit r14-9459-g618e34d56cc38e only handles
> > general_scalar_chain::convert_op. The patch also handles
> > timode_scalar_chain::convert_op to avoid potential similar bug.
> >
> > Boo
On Tue, Mar 19, 2024 at 12:16 AM Joseph Myers wrote:
>
> On Mon, 18 Mar 2024, liuhongt wrote:
>
> > +If @option{-fexcess-precision=16} is specified, casts and assignments of
> > +@code{_Float16} and @code{bfloat16_t} cause value to be rounded to their
> > +semantic types if they're supported by th
On Mon, Mar 25, 2024 at 8:51 PM Jakub Jelinek wrote:
>
> On Tue, Mar 12, 2024 at 07:57:59PM +0800, liuhongt wrote:
> > If alignb > ASAN_RED_ZONE_SIZE and offset[0] is not a multiple of
> > alignb, (base_align_bias - base_offset) may not be aligned to alignb,
> > causing a segmentation fault.
> >
> > Boots
On Tue, Mar 26, 2024 at 11:26 AM Hongtao Liu wrote:
>
> On Mon, Mar 25, 2024 at 8:51 PM Jakub Jelinek wrote:
> >
> > On Tue, Mar 12, 2024 at 07:57:59PM +0800, liuhongt wrote:
> > > If alignb > ASAN_RED_ZONE_SIZE and offset[0] is not a multiple of
> > > alig
On Mon, Apr 8, 2024 at 11:44 PM H.J. Lu wrote:
>
> Define following macros for APX options:
>
> 1. __APX_EGPR__: -mapx-features=egpr.
> 2. __APX_PUSH2POP2__: -mapx-features=push2pop2.
> 3. __APX_NDD__: -mapx-features=ndd.
> 4. __APX_PPX__: -mapx-features=ppx.
For -mapx-features=, we haven't decide
On Tue, Apr 9, 2024 at 9:58 AM H.J. Lu wrote:
>
> Define __APX_INLINE_ASM_USE_GPR32__ for -mapx-inline-asm-use-gpr32.
> When __APX_INLINE_ASM_USE_GPR32__ is defined, inline asm statements
> should contain only instructions compatible with r16-r31.
Ok.
>
> gcc/
>
> PR target/114587
>
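A minimal sketch of how code might test the macro described above. The macro
name is taken from the quoted patch description; whether a given compiler
defines it is an assumption, and the function name is hypothetical.

```c
/* Returns 1 when the compiler defines __APX_INLINE_ASM_USE_GPR32__, i.e.
   when inline asm statements should contain only instructions compatible
   with r16-r31, and 0 otherwise. Compilers without the quoted patch simply
   leave the macro undefined, so this falls back to 0. */
int inline_asm_must_be_gpr32_safe(void)
{
#ifdef __APX_INLINE_ASM_USE_GPR32__
    return 1;
#else
    return 0;
#endif
}
```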
On Thu, Apr 4, 2024 at 4:42 PM Jakub Jelinek wrote:
>
> On Wed, Apr 19, 2023 at 02:40:59AM +, Jiang, Haochen via Gcc-patches
> wrote:
> > > > (define_insn "aesenc"
> > > > - [(set (match_operand:V2DI 0 "register_operand" "=x,x")
> > > > - (unspec:V2DI [(match_operand:V2DI 1 "register_
On Tue, Apr 9, 2024 at 5:18 PM Jakub Jelinek wrote:
>
> On Tue, Apr 09, 2024 at 11:23:40AM +0800, Hongtao Liu wrote:
> > I think we can merge alternative 2 with 3 to
> > * return TARGET_AES ? \"vaesenc\t{%2, %1, %0|%0, %1, %2}"\" :
> > \"%{evex%} vae
On Tue, Apr 9, 2024 at 3:05 PM Hongyu Wang wrote:
>
> The latest APX spec announced removal of SHA/KEYLOCKER evex promotion [1],
> which means the SHA/KEYLOCKER insn does not support EGPR when APX
> enabled. Update the corresponding constraints to their EGPR-disabled
> counterparts.
>
> Bootstrapp
On Tue, Feb 6, 2024 at 11:49 AM H.J. Lu wrote:
>
> 1. The only supported TLS code sequence with ADD is
>
> addq foo@gottpoff(%rip),%reg
>
> Change je constraint to a memory operand in APX NDD ADD pattern with
> register source operand.
>
> 2. The instruction length of APX NDD instructions
On Wed, Feb 14, 2024 at 5:33 AM H.J. Lu wrote:
>
> Since push2/pop2 requires 16-byte stack alignment, don't generate them
> if the incoming stack isn't 16-byte aligned.
Ok.
>
> gcc/
>
> PR target/113912
> * config/i386/i386.cc (ix86_can_use_push2pop2): New.
> (ix86_pro_and_
On Thu, Feb 22, 2024 at 10:33 PM H.J. Lu wrote:
>
> On Sun, Feb 18, 2024 at 8:02 AM H.J. Lu wrote:
> >
> > If assembler and linker supports
> >
> > add %reg1, name@gottpoff(%rip), %reg2
> >
> > with R_X86_64_CODE_6_GOTTPOFF, we can generate it instead of
> >
> > mov name@gottpoff(%rip), %reg2
> >
On Mon, Feb 26, 2024 at 5:11 AM H.J. Lu wrote:
>
> ldtilecfg and sttilecfg take a 512-bit (64-byte) memory block. With
> _tile_loadconfig implemented as
>
> extern __inline void
> __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> _tile_loadconfig (const void *__config)
> {
> __asm__ v
On Mon, Feb 26, 2024 at 10:37 AM H.J. Lu wrote:
>
> On Sun, Feb 25, 2024 at 6:03 PM Hongtao Liu wrote:
> >
> > On Mon, Feb 26, 2024 at 5:11 AM H.J. Lu wrote:
> > >
> > > ldtilecfg and sttilecfg take a 512-bit (64-byte) memory block. With
> > > _tile_loadconf
On Mon, Feb 26, 2024 at 11:26 AM wrote:
>
> From: Pan Li
>
> We previously allowed vector types for get_stored_val when the read is
> less than or equal to the store. Unfortunately, we failed to adjust the
> validate_subreg part accordingly. For vector types, we don't need to
> restrict the mode size is
MODE_NATURAL_SIZE (imode);
>
> Pan
>
> -Original Message-
> From: Hongtao Liu
> Sent: Monday, February 26, 2024 11:41 AM
> To: Li, Pan2
> Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com;
> richard.guent...@gmail.com; Wang, Yanzhang ;
> rda
On Mon, Feb 26, 2024 at 6:30 PM H.J. Lu wrote:
>
> On Sun, Feb 25, 2024 at 8:25 PM H.J. Lu wrote:
> >
> > On Sun, Feb 25, 2024 at 7:03 PM Hongtao Liu wrote:
> > >
> > > On Mon, Feb 26, 2024 at 10:37 AM H.J. Lu wrote:
> > > >
> >
On Tue, Feb 27, 2024 at 3:44 PM Richard Biener wrote:
>
> On Tue, 27 Feb 2024, haochen.jiang wrote:
>
> > On Linux/x86_64,
> >
> > af66ad89e8169f44db723813662917cf4cbb78fc is the first bad commit
> > commit af66ad89e8169f44db723813662917cf4cbb78fc
> > Author: Richard Biener
> > Date: Fri Feb 23
On Wed, Feb 28, 2024 at 4:54 PM Jakub Jelinek wrote:
>
> Hi!
>
> Adding Hongtao and Honza into the loop as the ones who acked the original
> patch.
>
> The no_callee_saved_registers by default for noreturn functions change can
> break in-process backtrace(3) or backtraces from debugger or other pr
On Tue, Jan 9, 2024 at 3:09 PM Hongyu Wang wrote:
>
> Hi,
>
> For APX, the inline asm behavior was not mentioned in any document
> before. Add description for it.
>
> Ok for trunk?
>
> gcc/ChangeLog:
>
> * config/i386/i386.opt: Adjust document.
> * doc/invoke.texi: Add description
On Thu, Jan 11, 2024 at 7:06 AM Andi Kleen wrote:
>
> Hongtao Liu writes:
> >>
> >> +@opindex mapx-inline-asm-use-gpr32
> >> +@item -mapx-inline-asm-use-gpr32
> >> +When APX_F enabled, EGPR usage was by default disabled to prevent
> >> +unexp
On Fri, Jan 12, 2024 at 10:55 AM Jiang, Haochen wrote:
>
> > -Original Message-
> > From: Richard Biener
> > Sent: Thursday, January 11, 2024 4:19 PM
> > To: Liu, Hongtao
> > Cc: Jiang, Haochen ; gcc-patches@gcc.gnu.org;
> > ubiz...@gmail.com; bur...@net-b.de; san...@codesourcery.com
> >
On Thu, Jan 11, 2024 at 12:06 AM H.J. Lu wrote:
>
> On Tue, Jan 9, 2024 at 6:02 PM liuhongt wrote:
> >
> > After r14-2692-g1c6231c05bdcca, the option is defined as EnumSet and
> > -fcf-protection=branch won't unset any others bits since they're in
> > different groups. So to override -fcf-protect
On Wed, Jan 17, 2024 at 5:59 AM Roger Sayle wrote:
>
>
> I thought I'd just missed the bug fixing season of stage3, but there
> appears to be a little latitude in early stage4 (for vector patches), so
> I'll post this now.
>
> This patch resolves PR target/106060 by providing efficient methods for
>
On Wed, Jan 10, 2024 at 12:47 AM H.J. Lu wrote:
>
> When -fsanitize=hwaddress is used, libhwasan will try to enable LAM_U57
> in the startup code. Update the target check to enable hwaddress tests
> if LAM_U57 is enabled. Also compile hwaddress tests with -mlam=u57 on
> x86-64 since hwasan requi
On Sat, Jan 20, 2024 at 10:30 PM H.J. Lu wrote:
>
> When an interrupt handler is implemented by an assembly stub which does:
>
> 1. Save all registers.
> 2. Call a C function.
> 3. Restore all registers.
> 4. Return from interrupt.
>
> it is completely unnecessary to save and restore any registers
On Mon, Jan 22, 2024 at 10:31 AM Haochen Jiang wrote:
>
> Hi all,
>
> Recently, I happened to run i386.exp under -DDEBUG and found some failures.
>
> This patch aims to fix that. Ok for trunk?
OK.
>
> Thx,
> Haochen
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/adx-check.h: Include stdio.
On Tue, Jan 23, 2024 at 11:00 PM H.J. Lu wrote:
>
> Changes in v3:
>
> 1. Rebase against commit 02e68389494
> 2. Don't add call_no_callee_saved_registers to machine_function since
> all callee-saved registers are properly clobbered by callee with
> no_callee_saved_registers attribute.
>
The patch
constants.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32}
> with no new failures. Ok for mainline (in stage 1)?
Ok, thanks for handling this.
>
>
> 2024-01-25 Roger Sayle
>
On Wed, Dec 13, 2023 at 7:59 PM Jakub Jelinek wrote:
>
> On Fri, Dec 08, 2023 at 03:12:00PM +0800, liuhongt wrote:
> > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > Ready push to trunk.
> >
> > gcc/ChangeLog:
> >
> > PR target/112904
> > * config/i386/mmx.md (*xop_pcmov
On Thu, Dec 14, 2023 at 10:55 AM Haochen Jiang wrote:
>
> Hi all,
>
> According to ISE050 published at the end of September, RAO-INT will not
> be in Grand Ridge anymore. This patch aims to remove it.
>
> The documentation comes following:
>
> https://cdrdv2.intel.com/v1/dl/getContent/671368
>
> R
On Thu, Dec 14, 2023 at 3:54 PM Hongyu Wang wrote:
>
> Hi,
>
> Currently move_max follows the tuning feature first, but ideally it
> should sync with prefer-vector-width when it is explicitly set to keep
> vector move and operation with same vector size.
>
> Bootstrapped/regtested on x86-64-pc-lin
On Fri, Dec 15, 2023 at 10:34 AM Haochen Jiang wrote:
>
> Hi all,
>
> There is a recent change in AVX10 documentation which allows 64 bit mask
> register instructions in AVX10-256, the documentation comes following:
>
> Intel Advanced Vector Extensions 10 (Intel AVX10) Architecture Specification
>
On Fri, Dec 22, 2023 at 6:25 PM Roger Sayle wrote:
>
>
> This patch resolves the second part of PR target/112992, building upon
> Hongtao Liu's solution to the first part.
>
> The issue addressed by this patch is that when initializing vectors by
> broadcasting integer constants, the compiler has
get/i386/pr100865-5a.c: Likewise.
> * gcc.target/i386/pr100865-5b.c: Likewise.
> * gcc.target/i386/pr100865-9a.c: Likewise.
> * gcc.target/i386/pr100865-9b.c: Likewise.
> * gcc.target/i386/pr102021.c: Likewise.
> * gcc.target/i386/pr90773-17.c: Likewise.
>
On Thu, Dec 14, 2023 at 12:03 AM Jan Hubicka wrote:
>
> > > The difference is that Cores understand the fact that fmadd does not need
> > > all three parameters to start computation, while Zen cores don't.
> > >
> > > Since this seems noticeable win on zen and not loss on Core it seems like
>
On Mon, Jan 8, 2024 at 11:09 AM Hongyu Wang wrote:
>
> Hi,
>
> The supported sub-features for APX were missing from the option documentation
> and the target attribute section. Add the missing ones.
>
> Ok for trunk?
Ok.
>
> gcc/ChangeLog:
>
> * config/i386/i386.opt: Add supported sub-features.
>
On Fri, Dec 1, 2023 at 10:26 PM Richard Biener
wrote:
>
> On Fri, Dec 1, 2023 at 3:39 AM liuhongt wrote:
> >
> > > Hmm, I would suggest you put reg_needed into the class and accumulate
> > > over all vec_construct, with your patch you pessimize a single v32qi
> > > over two separate v16qi for exa
On Tue, Dec 5, 2023 at 10:32 AM Hongyu Wang wrote:
>
> Hi,
>
> APX NDD patches have been posted at
> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636604.html
>
> Thanks to Hongtao's review, the V2 patch adds support of zext semantic with
> memory input as NDD by default clear upper bits
On Mon, Dec 4, 2023 at 3:51 PM Uros Bizjak wrote:
>
> On Mon, Dec 4, 2023 at 8:11 AM Hongtao Liu wrote:
> >
> > On Fri, Dec 1, 2023 at 10:26 PM Richard Biener
> > wrote:
> > >
> > > On Fri, Dec 1, 2023 at 3:39 AM liuhongt wrote:
> > > >
On Wed, Dec 6, 2023 at 6:23 AM Jakub Jelinek wrote:
>
> Hi!
>
> Regardless of the outcome of the REG_UNUSED discussions, I think
> it is a good idea to move the vzeroupper pass one pass later.
> As can be seen in the multiple PRs and as postreload.cc documents,
> reload/LRA is known to create dead
On Mon, Dec 4, 2023 at 10:10 PM Richard Biener
wrote:
>
> On Mon, Dec 4, 2023 at 6:32 AM liuhongt wrote:
> >
> > i.e. for the cases below.
> >a[0] = b1;
> >a[1] = b2;
> >..
> >a[n] = bn;
> >
> > There're extra dependences when constructing the vector, but not for
> > scalar store. Acc
On Wed, Dec 6, 2023 at 8:11 PM Uros Bizjak wrote:
>
> On Wed, Dec 6, 2023 at 9:08 AM Hongyu Wang wrote:
> >
> > Hi,
> >
> > Following up the discussion of V2 patches in
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639368.html,
> > this patch series adds early clobber for all TImode
ping.
On Thu, Nov 16, 2023 at 6:49 PM liuhongt wrote:
>
> Update in V2:
> 1) Add some comments before the pattern.
> 2) Remove ? from view_convert.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
>
> When I'm working on PR112443, I notice there's some misoptimization
On Wed, Dec 6, 2023 at 3:52 PM Richard Biener
wrote:
>
> On Wed, Dec 6, 2023 at 3:33 AM Jiang, Haochen wrote:
> >
> > > -Original Message-
> > > From: Jiang, Haochen
> > > Sent: Friday, December 1, 2023 4:51 PM
> > > To: Richard Biener
> > > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao ;
>
On Mon, Dec 11, 2023 at 4:14 PM Richard Biener
wrote:
>
> On Mon, Dec 11, 2023 at 7:51 AM liuhongt wrote:
> >
> > > since you are looking at TYPE_PRECISION below you want
> > > VECTOR_INTEGER_TYPE_P here as well? The alternative
> > > would be to compare TYPE_SIZE.
> > >
> > > Some of the check
On Mon, Dec 11, 2023 at 8:39 PM Hongyu Wang wrote:
>
> > > +__int128 u128_2 = (9223372036854775808 << 4) * foo0_u8_0; /* {
> > > dg-warning "integer constant is so large that it is unsigned" "so large"
> > > } */
> >
> > You can just use (9223372036854775807LL + (__int128) 1) instead of
> >
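The suggested replacement can be sketched as follows. The function name is
hypothetical, and the code assumes a compiler that provides the __int128
extension (for example GCC on x86-64).

```c
/* Builds 2^63 without writing an out-of-range decimal literal.
   9223372036854775807LL is the largest long long value; converting the
   addition to __int128 lets the sum exceed that range without overflow. */
__int128 two_pow_63(void)
{
    return 9223372036854775807LL + (__int128) 1;
}
```

Note that the sum has type __int128, whereas the quoted warning says the
bare literal 9223372036854775808 is treated as unsigned, so later
operations such as the shift are carried out in different types.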
On Fri, Dec 8, 2023 at 10:17 AM liuhongt wrote:
>
> If the function doesn't clobber any sse registers or only clobbers the
> 128-bit part, then vzeroupper isn't issued before the function exit.
> The status is not CLEAN but ANY after the function.
>
> Also for sibling_call, it's safe to issue a vzeroupper