On Wed, Oct 21, 2020 at 4:45 PM Qing Zhao wrote:
>
>
>
> On Oct 21, 2020, at 3:03 AM, Uros Bizjak wrote:
>
> On Wed, Oct 21, 2020 at 9:18 AM Uros Bizjak wrote:
>
>
> On Tue, Oct 20, 2020 at 10:04 PM Qing Zhao wrote:
>
> +/* Check whether the register REGNO should be zeroed on X86.
> + When AL
On Wed, Oct 21, 2020 at 5:15 PM Jakub Jelinek wrote:
>
> On Wed, Sep 30, 2020 at 06:06:31PM +0200, Florian Weimer wrote:
> > --- a/gcc/common/config/i386/i386-common.c
> > +++ b/gcc/common/config/i386/i386-common.c
> > @@ -1795,9 +1795,13 @@ const pta processor_alias_table[] =
> > PTA_MMX | P
On Thu, Oct 22, 2020 at 4:47 PM Qing Zhao wrote:
>
> Hi, Uros,
>
> > On Oct 21, 2020, at 9:45 AM, Qing Zhao via Gcc-patches
> > wrote:
>
> >>
> >> Something like this:
> >>
> >> --cut here--
> >> long double
> >> __attribute__ ((noinline))
> >> test (long double a, long double b)
> >> {
> >
On Sat, Oct 24, 2020 at 6:05 PM Qing Zhao wrote:
>
> Hi,
>
> This is the 4th version of the implementation of patch -fzero-call-used-regs.
>
> The major changes compared to the previous version are:
>
> 1. Documentation change per Richard’s suggestion;
> 2. Command sub options handling per Richar
On Mon, Oct 26, 2020 at 3:45 PM Qing Zhao wrote:
>
>
> +/* Generate insns to zero all st/mm registers together.
> + Return true when zeroing instructions are generated.
> + Assume the number of st registers that are zeroed is num_of_st,
> + we will emit the following sequence to zero them to
On Mon, Oct 26, 2020 at 6:30 PM Qing Zhao wrote:
>
>
> The following is the current change in i386.c, could you check whether the
> logic is good?
x87 handling looks good to me.
One remaining question: If the function uses MMX regs (either
internally or as an argument register), but exits in x8
On Mon, Oct 26, 2020 at 8:10 PM Qing Zhao wrote:
>
>
>
> > On Oct 26, 2020, at 1:42 PM, Uros Bizjak wrote:
> >
> > On Mon, Oct 26, 2020 at 6:30 PM Qing Zhao wrote:
> >>
> >>
> >> The following is the current change in i386.c, could you check whether the
> >> logic is good?
> >
> > x87 handling
On Mon, Oct 26, 2020 at 9:05 PM Uros Bizjak wrote:
>
> On Mon, Oct 26, 2020 at 8:10 PM Qing Zhao wrote:
> >
> >
> >
> > > On Oct 26, 2020, at 1:42 PM, Uros Bizjak wrote:
> > >
> > > On Mon, Oct 26, 2020 at 6:30 PM Qing Zhao wrote:
> > >>
> > >>
> > >> The following is the current change in i386
On Tue, Oct 27, 2020 at 12:08 AM Qing Zhao wrote:
>
> Hi, Uros,
>
> Could you please check the change compared to the previous version for i386.c
> as following:
> Let me know any issue there.
It looks like the combination where the function only touches MMX
registers (so, no x87 register is touc
On Tue, Oct 27, 2020 at 5:10 PM Qing Zhao wrote:
>
> Uros,
>
> The following is the change compared to version 4 after fixing all the issues
> you raised in the previous email.
>
> Let me know if there is any other issue.
LGTM for x86 part, with a couple of small review comments inline.
Thanks,
Ur
On Wed, Oct 28, 2020 at 10:54 AM Hongyu Wang wrote:
>
> Hi Uros,
>
> Thanks for the example. We've updated the patterns with new expanders
> and predicates like vzeroall.
> Now the generated insn for "encodekey128u32" is like
>
> (insn 7 6 8 2 (parallel [
> (set (reg:SI 84 [ ])
>
On Thu, Oct 29, 2020 at 12:55 AM Qing Zhao wrote:
>
> Hi,
>
> This is the 5th version of the implementation of patch -fzero-call-used-regs.
>
> The major changes compared to the previous version (4th version) are:
>
> 1. Documentation change per Richard’s suggestion;
> 2. Use namespace for zero_reg
On Thu, Oct 29, 2020 at 7:52 AM Hongyu Wang wrote:
>
> Hi Uros,
>
> > is there a reason to introduce all these (with corresponding changes)?
> > SSE options live in ISA bitmap, so it is kind of strange you need to
> > handle them in ISA2 bitmap. Option handling is not exactly my area,
> > please a
> -fstack-usage raises a "stack usage computation not supported for this target"
> warning when it encounters a naked function because the prologue returns early
> for naked function on i386. This patch sets the stack usage to zero for naked
> function, following the fix done for Arm by Eric Botcaz
> As observed a number of years ago in the following thread, i386/i386elf.h has
> not been
> kept up to date:
>
> https://gcc.gnu.org/pipermail/gcc/2013-August/209981.html
>
> This patch does the following cleanup:
>
> 1. The return convention now follows the i386 and x86_64 SVR4 ABIs again. As
>
On Sat, Nov 5, 2022 at 12:25 PM Richard Biener
wrote:
>
> On Wed, Nov 2, 2022 at 1:46 PM Uros Bizjak wrote:
> >
> > On Wed, Nov 2, 2022 at 1:45 PM Robin Dapp wrote:
> > >
> > > > IIRC, I was trying to "fix" modeless operand by giving it a mode, but
> > > > since it made no difference for x86, I
On Mon, Nov 7, 2022 at 2:41 AM Haochen Jiang wrote:
>
> gcc/ChangeLog:
>
> * config/i386/i386-options.cc (m_CORE_ATOM): New.
> * config/i386/x86-tune.def
> (X86_TUNE_SCHEDULE): Initial tune for CORE_ATOM.
> (X86_TUNE_PARTIAL_REG_DEPENDENCY): Ditto.
> (X86_TU
On Sun, Nov 6, 2022 at 2:00 PM Kong, Lingling via Gcc-patches
wrote:
>
> Hi
>
> The patch is to add flag -mprefer-remote-atomic to control whether to
> generate raoint insn for atomic operations.
> Ok for trunk?
Please note TARGET_AVOID_MFENCE tuning flag, introduced a while ago
due to the fact
On Tue, Nov 8, 2022 at 11:42 AM Jakub Jelinek wrote:
>
> Hi!
>
> For integer vector comparisons without XOP before AVX512{F,VL} we are
> constrained by only GT and EQ being supported in HW.
> For GTU we play tricks to implement it using GT or unsigned saturating
> subtraction, for LT/LTU we swap t
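The tricks mentioned in the quoted text can be modelled on scalars. The sketch below is only an illustration, not the code from the patch: the function names are hypothetical, and the 8-bit saturating subtraction stands in for one vector lane. It assumes GCC's modulo behaviour when converting an out-of-range unsigned value to a signed type.

```c
#include <stdbool.h>
#include <stdint.h>

/* GTU via signed GT: flipping the sign bit of both operands maps
   unsigned ordering onto signed ordering.  */
static bool
gtu_via_signed_gt (uint32_t a, uint32_t b)
{
  int32_t sa = (int32_t) (a ^ 0x80000000u);
  int32_t sb = (int32_t) (b ^ 0x80000000u);
  return sa > sb;
}

/* Scalar model of one lane of an unsigned saturating subtraction.  */
static uint8_t
subus8 (uint8_t a, uint8_t b)
{
  return a > b ? (uint8_t) (a - b) : 0;
}

/* GTU via saturating subtraction: a - b saturates to 0 unless a > b.  */
static bool
gtu_via_subus (uint8_t a, uint8_t b)
{
  return subus8 (a, b) != 0;
}

/* LTU needs no separate operation: swap the operands of GTU.  */
static bool
ltu_via_swap (uint32_t a, uint32_t b)
{
  return gtu_via_signed_gt (b, a);
}
```

For example, gtu_via_signed_gt (0xffffffffu, 1) is true even though the same bit patterns compare the other way round as signed integers.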
On Thu, Nov 10, 2022 at 10:29 AM Jakub Jelinek wrote:
>
> Hi!
>
> The following patch fixes ICE on the testcase. I've used GEN_INT
> incorrectly thinking the code punts on the problematic boundaries.
> It does, but only for LE and GE, i.e. signed comparisons, for unsigned
> the boundaries are 0 a
On Mon, Nov 14, 2022 at 8:48 AM Jakub Jelinek wrote:
>
> Hi!
>
> Working virtually out of Baker Island.
>
> We got a response from AMD in
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104688#c10
> so the following patch starts treating AMD with AVX and CMPXCHG16B
> ISAs like Intel by using vmovdq
On Mon, Nov 14, 2022 at 8:52 AM Jakub Jelinek wrote:
>
> Hi!
>
> Working virtually out of Baker Island.
>
> Given
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104688#c10
> the following patch implements atomic load/store (and therefore also
> enabling compare and exchange) for -m64 -mcx16 -mavx.
INSERTPS can select any element from src and insert into any place
of the dest. For SSE4.1 targets, the compiler can generate e.g.
insertps $64, %xmm0, %xmm1
to insert element 1 from %xmm0 to element 0 of %xmm1.
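How the immediate selects the elements can be sketched with a scalar model of the register form of INSERTPS. The function name insertps_model is hypothetical and the arrays stand in for XMM registers; the bit layout of the immediate (source index in bits 7:6, destination index in bits 5:4, zero mask in bits 3:0) follows the instruction's documented encoding.

```c
#include <stdint.h>

/* Scalar model of the SSE4.1 INSERTPS register form.
   imm8[7:6] selects the source element, imm8[5:4] the destination
   slot, and imm8[3:0] is a mask of result elements to zero.  */
static void
insertps_model (float dst[4], const float src[4], uint8_t imm8)
{
  unsigned src_idx = (imm8 >> 6) & 3;
  unsigned dst_idx = (imm8 >> 4) & 3;
  unsigned zmask = imm8 & 0xf;

  /* Copy the selected source element into the selected slot.  */
  dst[dst_idx] = src[src_idx];

  /* Then clear every element whose zero-mask bit is set.  */
  for (unsigned i = 0; i < 4; i++)
    if (zmask & (1u << i))
      dst[i] = 0.0f;
}
```

With an immediate of 64 (0x40) the model copies source element 1 into destination element 0 and leaves the other three destination elements untouched.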
gcc/ChangeLog:
PR target/94908
* config/i386/i386-builtin.def (__builtin_i
On Tue, Apr 18, 2023 at 7:20 PM Jakub Jelinek wrote:
>
> On Mon, Apr 17, 2023 at 11:27:28PM +0200, Uros Bizjak via Gcc-patches wrote:
> > --- a/gcc/rtl.h
> > +++ b/gcc/rtl.h
> > @@ -1972,6 +1972,13 @@ set_regno_raw (rtx x, unsigned int regno, unsigned
> > int
On Wed, Apr 19, 2023 at 1:33 AM Andrew Pinski via Gcc-patches
wrote:
>
> After a phiopt change, I got a failure of cmov9.c.
> The RTL IR has zero_extend on the outside of
> the if_then_else rather than on the side. Both
> ways are considered canonical as mentioned in
> PR 66588.
>
> This fixes the
Following code:
typedef __SIZE_TYPE__ size_t;
struct S1s
{
  char pad1;
  char val;
  short pad2;
};
extern char ts[256];
_Bool foo (struct S1s a, size_t i)
{
  return (ts[i] > a.val);
}
compiles with -O2 to:
        movl    %edi, %eax
        movsbl  %ah, %edi
        cmpb    %dil, ts(%rsi)
Introduce extract_operator predicate to handle both zero-extract and
sign-extract operations with expressions like:
(subreg:QI
(zero_extract:SWI248
(match_operand 1 "int248_register_operand" "0")
(const_int 8)
(const_int 8)) 0)
As shown in the testcase, this wil
gcc/ChangeLog:
* config/arm/arm.cc (thumb1_legitimate_address_p):
Use VIRTUAL_REGISTER_P predicate.
(arm_eliminable_register): Ditto.
* config/avr/avr.md (push_1): Ditto.
* config/bfin/predicates.md (register_no_elim_operand): Ditto.
* config/h8300/predicates.md (register_n
x86 was converted to TARGET_LEGITIMATE_ADDRESS_P long ago. Remove
remnants of the conversion. Also, clean up the remaining macros a bit
by introducing INDEX_REGNO_P macro.
No functional change.
gcc/ChangeLog:
2023-04-21 Uroš Bizjak
* config/i386/i386.h (REG_OK_FOR_INDEX_P, REG_OK_FOR_BA
On Sun, Apr 23, 2023 at 6:48 PM Segher Boessenkool
wrote:
>
> This minimal patch enables LRA for all targets. It does not clean up
> the target code, nor does it do anything to generic code: it just
> deletes all target definitions of TARGET_LRA_P.
>
> There are three kinds of changes:
>
> 1) Tar
On Mon, Apr 24, 2023 at 11:19 AM Segher Boessenkool
wrote:
>
> On Sun, Apr 23, 2023 at 11:06:41PM +0200, Uros Bizjak wrote:
> > > I send this patch now so that people can start testing. I don't plan to
> > > commit this for another week at least, for a week after GCC 13 release I
> > > guess? Ho
Use the same approach as in register_no_elim_operand predicate, but also
reject stack_pointer_rtx operands.
gcc/ChangeLog:
* config/i386/predicates.md (index_register_operand): Reject
arg_pointer_rtx, frame_pointer_rtx, stack_pointer_rtx and
VIRTUAL_REGISTER_P operands. Allow subregs
The predicates of the ashift-to-lea post-reload splitter were too broad,
so the splitter tried to convert the mask shift instruction. Tighten
operand predicates to match only general registers.
gcc/ChangeLog:
PR target/109733
* config/i386/predicates.md (index_reg_operand): New predicate.
For SSE2 targets the expander unpacks input elements into the correct
position in the V4SI vector and emits PMULUDQ instruction. The output
elements are then shuffled back to their positions in the V2SI vector.
For SSE4 targets PMULLD instruction is emitted directly.
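Why a widening unsigned multiply can implement a plain 32-bit element multiply may be easier to see on scalars. This is a sketch under the assumption that each lane is handled independently; mulv2si_model is a hypothetical name and the arrays stand in for V2SI vectors. The low 32 bits of the 64-bit product are exactly the result a truncating multiply such as PMULLD produces.

```c
#include <stdint.h>

/* Scalar model of a two-element 32-bit multiply built from a
   widening 32x32->64 multiply, keeping only the low half.  */
static void
mulv2si_model (uint32_t out[2], const uint32_t a[2], const uint32_t b[2])
{
  for (int i = 0; i < 2; i++)
    {
      /* Widening multiply, as PMULUDQ does per selected lane.  */
      uint64_t wide = (uint64_t) a[i] * b[i];
      /* Truncation gives the same bits as a 32-bit multiply.  */
      out[i] = (uint32_t) wide;
    }
}
```

The same holds for signed inputs, because the low half of a product does not depend on the signedness of the operands.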
gcc/ChangeLog:
* config
Rename index_register_operand predicate to what it really does.
No functional change.
gcc/ChangeLog:
* config/i386/predicates.md (register_no_SP_operand):
Rename from index_register_operand.
(call_register_operand): Update for rename.
* config/i386/i386.md (*lea_general_[1234]):
On Sat, May 6, 2023 at 4:00 PM Roger Sayle wrote:
>
>
> Hi Uros,
> This is a repost/respin of a patch that was conditionally approved:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609470.html
>
> This patch adds a convenient post-reload splitter for setting/updating
> the highpart of a
On Tue, Jan 31, 2023 at 9:02 AM Jakub Jelinek wrote:
>
> Hi!
>
> The following testcase is miscompiled. The problem is that during
> RTL DSE we see a V4DI register is being loaded { 16, 16, 0, 0 }
> value and DSE mostly works in terms of scalar modes, so it calls
> movoi to set an OImode REG to (
On Thu, Feb 9, 2023 at 4:43 PM Jakub Jelinek wrote:
>
> On Thu, Feb 09, 2023 at 07:30:52AM -0800, H.J. Lu wrote:
> > On Thu, Feb 9, 2023 at 4:12 AM Jakub Jelinek wrote:
> > > get_available_features doesn't depend on cpu_model2->__cpu_{family,model}
> > > and just sets stuff up based on CPUID leaf
On Thu, May 26, 2022 at 8:41 PM Roger Sayle wrote:
>
>
> A common idiom for testing if a specific set of bits is set in a value
> is to use "(X & Y) == Y", which on x86 results in an AND followed by a
> CMP. A slightly improved implementation is to instead use (~X & Y)==0,
> that uses a NOT and a
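The equivalence of the two forms named in the quoted text can be checked with a small C sketch. The function names are hypothetical; the point is only that both expressions compute the same predicate, so a compiler is free to pick whichever is cheaper.

```c
#include <stdbool.h>
#include <stdint.h>

/* Original idiom: AND followed by a compare against Y.  */
static bool
all_bits_set_and_cmp (uint32_t x, uint32_t y)
{
  return (x & y) == y;
}

/* Equivalent form: complement X, AND with Y, test against zero.
   Every bit of Y is set in X exactly when no bit of Y survives
   the AND with ~X.  */
static bool
all_bits_set_not_test (uint32_t x, uint32_t y)
{
  return (~x & y) == 0;
}
```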
On Fri, May 27, 2022 at 10:05 AM Jan Beulich wrote:
>
> It's pretty clear that the operand numbers in the MEM_P() checks are
> off by one, perhaps due to a copy-and-paste oversight (unlike in most
> other places here we're dealing with two outputs).
> ---
> What I don't understand is why operand 2
On Fri, May 27, 2022 at 10:13 AM Jan Beulich wrote:
>
> Like noticed for gas as well (binutils-gdb commit c8cad9d389b7), the
> "absolute difference" aspect of the insns makes their source operands
> commutative.
You will need to expand via ix86_fixup_binary_operands_no_copy, use
register_mmxmem_o
On Mon, May 30, 2022 at 9:59 AM Jan Beulich wrote:
>
> On 27.05.2022 11:05, Uros Bizjak wrote:
> > On Fri, May 27, 2022 at 10:13 AM Jan Beulich wrote:
> >>
> >> Like noticed for gas as well (binutils-gdb commit c8cad9d389b7), the
> >> "absolute difference" aspect of the insns makes their source o
On Mon, May 30, 2022 at 11:11 AM Roger Sayle wrote:
>
>
> Hi Uros,
> This is a ping of my patch from April, which as you've suggested should be
> submitted
> for review even if there remain two missed-optimization regressions on ia32
> (to
> allow reviewers to better judge if those fixes are appro
On Mon, May 30, 2022 at 11:18 AM Uros Bizjak wrote:
>
> On Mon, May 30, 2022 at 11:11 AM Roger Sayle
> wrote:
> >
> >
> > Hi Uros,
> > This is a ping of my patch from April, which as you've suggested should be
> > submitted
> > for review even if there remain two missed-optimization regressions
i386: Remove constraints when used with constant integer predicates, take 2
const_int_operand and other const*_operand predicates do not need
constraints when the constraint is inherited from the range of
constant integer predicate. Remove the constraint in case all
alternatives use the same inhe
On Mon, May 30, 2022 at 7:50 PM Roger Sayle wrote:
>
>
> This patch resolves PR rtl-optimization/101617 where we should generate
> the exact same code for (X ? -1 : 1) as we do for ((X ? -1 : 0) | 1).
> The cause of the current difference on x86_64 is actually in
> ix86_expand_int_movcc that doesn
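That the two source forms from the quoted text are the same function can be seen directly: ORing 1 into -1 leaves -1, and ORing 1 into 0 gives 1. A minimal sketch, with hypothetical function names:

```c
/* Direct form: select between -1 and 1.  */
static int
select_direct (int x)
{
  return x ? -1 : 1;
}

/* Mask form: build -1 or 0 first, then OR in the low bit.  */
static int
select_via_or (int x)
{
  return (x ? -1 : 0) | 1;
}
```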
On Mon, May 30, 2022 at 3:22 PM Roger Sayle wrote:
>
>
> This patch is a form of insurance policy in case my patch for PR 7061 runs
> into problems on non-x86 targets; the middle-end can add an extra check
> that the backend is happy placing SCmode and DImode values in the same
> register, before
On Mon, May 30, 2022 at 10:12 PM Uros Bizjak wrote:
>
> On Mon, May 30, 2022 at 3:22 PM Roger Sayle
> wrote:
> >
> >
> > This patch is a form of insurance policy in case my patch for PR 7061 runs
> > into problems on non-x86 targets; the middle-end can add an extra check
> > that the backend is
On Thu, Jun 2, 2022 at 10:00 AM Jakub Jelinek wrote:
>
> Hi!
>
> As the following testcase shows, our x86 backend support for optimizing
> out useless masking of shift/rotate counts when using instructions
> that naturally modulo the count themselves is insufficient.
> The *_mask define_insn_and_s
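The kind of source code this is about looks roughly like the sketch below, where the count is masked explicitly to keep the C well defined. The function names are hypothetical. On x86 the 32-bit shift and rotate instructions already reduce the count modulo 32, so the explicit AND is redundant there.

```c
#include <stdint.h>

/* Left shift with an explicit count mask; the mask avoids undefined
   behaviour in C for counts of 32 or more.  */
static uint32_t
shl_masked (uint32_t x, unsigned n)
{
  return x << (n & 31);
}

/* Rotate left written with masked counts on both shifts, so that a
   count of 0 (or any multiple of 32) is handled without UB.  */
static uint32_t
rotl_masked (uint32_t x, unsigned n)
{
  n &= 31;
  return (x << n) | (x >> ((32 - n) & 31));
}
```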
On Thu, Jun 2, 2022 at 9:20 AM Roger Sayle wrote:
>
> The simple test case below demonstrates an interesting register
> allocation challenge facing x86_64, imposed by ABI requirements
> on int128.
>
> __int128 foo(__int128 x, __int128 y)
> {
> return x+y;
> }
>
> For which GCC currently generate
On Thu, Jun 2, 2022 at 11:32 AM Uros Bizjak wrote:
>
> On Thu, Jun 2, 2022 at 9:20 AM Roger Sayle wrote:
> >
> > The simple test case below demonstrates an interesting register
> > allocation challenge facing x86_64, imposed by ABI requirements
> > on int128.
> >
> > __int128 foo(__int128 x, __in
On Thu, Jun 2, 2022 at 5:00 PM Jan Beulich wrote:
>
> Like noticed for gas as well (binutils-gdb commit c8cad9d389b7), the
> "absolute difference" aspect of the insns makes their source operands
> commutative.
>
> gcc/
>
> * config/i386/mmx.md (mmx_psadbw): Convert to expander.
> (
On Fri, Jun 3, 2022 at 11:49 AM Roger Sayle wrote:
>
>
> Technically, PR target/91681 has already been resolved; we now recognize the
> highpart multiplication at the tree-level, we no longer use the stack, and
> we currently generate the same number of instructions as LLVM. However, it
> is stil
On Fri, Jun 3, 2022 at 12:08 PM Uros Bizjak wrote:
>
> On Fri, Jun 3, 2022 at 11:49 AM Roger Sayle
> wrote:
> >
> >
> > Technically, PR target/91681 has already been resolved; we now recognize the
> > highpart multiplication at the tree-level, we no longer use the stack, and
> > we currently gen
On Thu, Jun 2, 2022 at 5:11 PM Jan Beulich wrote:
>
> The length attribute ought to be "the (bounding maximum) length of an
> instruction" according to the comment next to its definition. A register
> operand encoded using the ModR/M.rm field will additionally use VEX.B
> for encoding the highest
On Fri, Jun 3, 2022 at 12:17 PM Jakub Jelinek wrote:
>
> Hi!
>
> My PR105778 patch apparently broke the following testcase.
> If the mask has the top relevant bit clear (i.e. we know we are shifting
> by 0 to wordsize bits - 1) but doesn't have all the bits below it set,
> we emit andsi3 before th
On Fri, Jun 3, 2022 at 12:38 PM Jakub Jelinek wrote:
>
> On Fri, Jun 03, 2022 at 12:23:36PM +0200, Uros Bizjak wrote:
> > I think it is better to leave the operation in its natural mode and
> > leave the peephole pass to do its magic, depending on the target.
>
> So like this?
You can use ix86_ex
On Sat, Jun 4, 2022 at 1:03 PM Roger Sayle wrote:
>
>
> By way of an apology for causing PR target/105791, where I'd overlooked
> the need to support V1TImode in TARGET_XOP's vpcmov instruction, this
> patch further improves support for TARGET_XOP's vpcmov instruction, by
> recognizing it in combi
On Sun, Jun 5, 2022 at 1:48 PM Roger Sayle wrote:
>
>
> Hi Uros,
> Many thanks for your speedy review. This revised patch implements
> all three of your recommended improvements; the use of
> ix86_binary_operator_ok with code UNKNOWN, the removal of
> "n" constraints from const_int_operand predic
On Sun, Jun 5, 2022 at 7:19 PM Roger Sayle wrote:
>
>
> This patch extends the recent and;cmp to not;test optimization to also
> perform this transformation for TImode on TARGET_64BIT and DImode on -m32.
> One motivation for this is that it's a step to fixing the current failure
> of gcc.target/i3
On Thu, Jun 2, 2022 at 5:04 PM Jan Beulich wrote:
>
> The 64-bit, 128-bit, and 512-bit variants have VDI return type, in
> line with instruction behavior. Make the 256-bit builtin match, thus
> also making it match the insn it expands to (using VI8_AVX2_AVX512BW).
>
> gcc/
>
> * config/i38
On Mon, Jun 6, 2022 at 10:23 AM Roger Sayle wrote:
>
>
> Hi Uros,
> > > The major theme of this patch is to generalize many of i386.md's
> > > *di3_doubleword patterns to become *_doubleword patterns, i.e.
> > > whenever there exists a "double word" optimization for DImode with
> > > -m32, there s
On Sun, Jun 5, 2022 at 7:19 PM Roger Sayle wrote:
>
>
> This patch extends the recent and;cmp to not;test optimization to also
> perform this transformation for TImode on TARGET_64BIT and DImode on -m32.
> One motivation for this is that it's a step to fixing the current failure
> of gcc.target/i3
On Mon, Jun 6, 2022 at 1:28 PM Uros Bizjak wrote:
>
> On Sun, Jun 5, 2022 at 7:19 PM Roger Sayle wrote:
> >
> >
> > This patch extends the recent and;cmp to not;test optimization to also
> > perform this transformation for TImode on TARGET_64BIT and DImode on -m32.
> > One motivation for this is
On Tue, Jun 7, 2022 at 9:41 AM liuhongt wrote:
>
> So alternative v won't be ignored in record_reg_classes.
>
> Similar for *r alternatives in some vector patterns.
>
> It helps testcase in the PR, also RA now makes better decisions for
> gcc.target/i386/extract-insert-combining.c
>
> movd
On Tue, Jun 7, 2022 at 6:56 AM liuhongt via Gcc-patches
wrote:
>
> (define_insn_and_split "ssse3_palignrdi"
>   [(set (match_operand:DI 0 "register_operand" "=y,x,Yv")
>         (unspec:DI [(match_operand:DI 1 "register_operand" "0,0,Yv")
>                     (match_operand:DI
On Fri, Jun 10, 2022 at 8:28 PM H.J. Lu wrote:
>
> Since F16C and VAES are only usable with AVX, require AVX for F16C and
> VAES.
>
> OK for master and release branches?
>
> Thanks.
>
> H.J.
> ---
> libgcc/105920
> * common/config/i386/cpuinfo.h (get_available_features): Require
>
On Fri, Jun 10, 2022 at 9:27 PM Jakub Jelinek wrote:
>
> Hi!
>
> Another regression caused by my recent patch.
>
> This time because define_insn_and_split only requires that the
> constant mask is const_int_operand. When it was only SImode,
> that wasn't a problem, HImode neither, but for DImode
Under certain conditions register_operand predicate also allows
subregs of memory operands. When RTL checking is enabled, these
will fail with REGNO (op).
Allow subregs of memory operands; these are guaranteed
to be reloaded to a register.
2022-06-13 Uroš Bizjak
gcc/ChangeLog:
PR target
On Wed, Jun 15, 2022 at 12:49 AM liuhongt wrote:
>
> (In reply to Uroš Bizjak from comment #1)
> > Instruction does not accept memory operand for operand 3:
> >
> > (define_insn_and_split
> > "*_blendv_ltint"
> > [(set (match_operand: 0 "register_operand" "=Yr,*x,x")
> > (unspec:
> >
REGNO should not be used with register_operand before reload because
subregs of registers or even subregs of memory match the predicate.
The build with RTL checking enabled does not tolerate REGNO with
non-reg operand.
The patch splits the splitter into two related splitters and uses
(match_dup ...
The mode of pointer argument should equal ptr_mode, not Pmode.
2022-06-17 Uroš Bizjak
gcc/ChangeLog:
PR target/105970
* config/i386/i386.cc (ix86_function_arg): Assert that
the mode of pointer argument is equal to ptr_mode, not Pmode.
gcc/testsuite/ChangeLog:
PR target/105970
This patch introduces alpha-specific version of store_data_bypass_p that
ignores TRAP_IF that would result in assertion failure (and internal
compiler error) in the generic store_data_bypass_p function.
While at it, also remove ev4_ist_c reservation, store_data_bypass_p
can handle the patterns wit
On Mon, Jun 20, 2022 at 9:27 AM liuhongt wrote:
>
> The patch is similar to [1], but use reg_or_subregno instead of REGNO.
>
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596804.html
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
>
> gcc/ChangeLog:
>
>
On Mon, Jun 20, 2022 at 4:03 PM H.J. Lu wrote:
>
> On Tue, Jun 14, 2022 at 12:25 PM H.J. Lu wrote:
> >
> > Disallow sibcall when calling ifunc functions with PIC register so that
> > PIC register can be restored.
> >
> > gcc/
> >
> > PR target/105960
> > * config/i386/i386.cc (ix86
On Mon, Jun 20, 2022 at 10:04 AM Haochen Jiang wrote:
>
> From: "Jiang, Haochen"
>
> Hi all,
>
> We need a syscall to enable AMX for kernels >= 5.4. It is missing in the current
> AMX tests, which causes the tests to fail.
So this new code is only valid for linux & co?
Uros.
>
> This patch aims to add the
On Mon, Jun 20, 2022 at 8:14 PM H.J. Lu wrote:
>
> On Tue, May 10, 2022 at 9:25 AM H.J. Lu wrote:
> >
> > Mark a function with SYMBOL_FLAG_FUNCTION_ENDBR when inserting ENDBR at
> > function entry. Skip the 4-byte ENDBR when emitting a direct call/jmp
> > to a local function with ENDBR at functi
On Tue, Jun 21, 2022 at 4:23 AM Jiang, Haochen wrote:
>
> > -Original Message-
> > From: Uros Bizjak
> > Sent: Monday, June 20, 2022 10:54 PM
> > To: Jiang, Haochen
> > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao
> > Subject: Re: [PATCH] i386: Add syscall to enable AMX for latest kernels
On Tue, Jun 21, 2022 at 9:41 AM Jiang, Haochen wrote:
>
> > -Original Message-
> > From: Uros Bizjak
> > Sent: Tuesday, June 21, 2022 3:06 PM
> > To: Jiang, Haochen
> > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao
> > Subject: Re: [PATCH] i386: Add syscall to enable AMX for latest kernels
On Tue, Jun 21, 2022 at 4:46 PM H.J. Lu wrote:
>
> On Mon, Jun 20, 2022 at 7:51 AM Uros Bizjak wrote:
> >
> > On Mon, Jun 20, 2022 at 4:03 PM H.J. Lu wrote:
> > >
> > > On Tue, Jun 14, 2022 at 12:25 PM H.J. Lu wrote:
> > > >
> > > > > Disallow sibcall when calling ifunc functions with PIC register
On Wed, Jun 22, 2022 at 1:39 PM Roger Sayle wrote:
>
>
> This patch addresses PR target/105930 which is an ia32 stack frame size
> regression in high-register pressure XOR-rich cryptography functions
> reported by Linus Torvalds. The underlying problem is once the limited
> number of registers on
On Fri, Jun 24, 2022 at 8:19 PM David Malcolm wrote:
>
> I'd like to ping this patch:
>https://gcc.gnu.org/pipermail/gcc-patches/2022-May/595440.html
>
> OK for trunk?
I have no idea what the patch does, but if all other targets do the same,
x86 shouldn't be left behind.
So, rubber-stamping OK.
On Sun, Jun 26, 2022 at 1:12 PM Roger Sayle wrote:
>
>
> This patch is a follow-up improvement to my recent patch for
> PR rtl-optimization/7061. That patch added the test case
> gcc.target/i386/pr7061-2.c:
>
> float im(float _Complex a) { return __imag__ a; }
>
> For which GCC on x86_64 currentl
On Sun, Jun 26, 2022 at 2:04 PM Roger Sayle wrote:
>
>
> This patch addresses PR rtl-optimization/96692 on x86_64, by providing
> a define_split for combine to convert the three operation ((A|B)^C)^D
> into a two operation sequence using andn when either A or B is the same
> register as C or D. T
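The identity behind this split can be sketched in C for the case where C is the same value as A; the function names are hypothetical. Where a bit of A is set, (A|B)^A is 0, and where it is clear the result is the bit of B, so (A|B)^A equals ~A & B, which is what a single andn computes.

```c
#include <stdint.h>

/* Three operations: OR, XOR, XOR.  */
static uint32_t
three_ops (uint32_t a, uint32_t b, uint32_t d)
{
  return ((a | b) ^ a) ^ d;
}

/* Two operations: ANDN (~a & b), then XOR.  */
static uint32_t
two_ops (uint32_t a, uint32_t b, uint32_t d)
{
  return (~a & b) ^ d;
}
```

Because XOR is commutative and associative, the same rewrite applies when D rather than C matches A or B.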
On Sun, Jun 26, 2022 at 5:54 PM Roger Sayle wrote:
>
>
> This patch was motivated by the investigation of Linus Torvalds' spill
> heavy cryptography kernels in PR 105930. The di3 expander
> handles all rotations by an immediate constant for 1..63 bits with the
> exception of 32 bits, which FAILs
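What makes a rotation by 32 bits special for a 64-bit value held in two 32-bit registers is that it merely exchanges the two halves. The sketch below only illustrates that property and is not taken from the patch; rotl64 and swap_halves are hypothetical names.

```c
#include <stdint.h>

/* Generic 64-bit rotate left, with the count reduced modulo 64.  */
static uint64_t
rotl64 (uint64_t x, unsigned n)
{
  n &= 63;
  return n ? (x << n) | (x >> (64 - n)) : x;
}

/* Rotation by exactly 32 bits: the low and high words trade places,
   so no shift instructions are needed on a 32-bit target.  */
static uint64_t
swap_halves (uint64_t x)
{
  uint32_t lo = (uint32_t) x;
  uint32_t hi = (uint32_t) (x >> 32);
  return ((uint64_t) lo << 32) | hi;
}
```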
On Tue, Jun 28, 2022 at 8:49 AM Jan Beulich wrote:
>
> When enabling AVX512FP via attribute or pragma, the _Float16 type would
> remain unavailable when at initialization time SSE2 wouldn't be seen as
> available for use. While this may hint at a wider underlying issue (like
> the feature, the typ
On Tue, Jun 28, 2022 at 8:48 AM Jan Beulich wrote:
>
> So far on 32-bit hosts this test failed (for both C and C++) because of
> the ABI change warning occurring without (explicitly) enabling MMX.
>
> gcc/testsuite/
>
> * c-c++-common/torture/builtin-shufflevector-2.c: Prune ix86 MMX
>
On Tue, Jun 28, 2022 at 1:34 PM Roger Sayle wrote:
>
>
> Hi Uros,
> As you've requested/suggested, here's a patch that tidies up and
> unifies doubleword handling in i386.md; converting all doubleword
> splitters for logic operations to post-reload form, generalizing
> their define_insn_and_split
On Thu, Jun 30, 2022 at 7:59 AM Haochen Jiang wrote:
>
> Hi all,
>
> This patch aims to fix the cvtps2pd insn, which should also work on
> memory operand but currently does not. After this fix, when loop == 2,
> it will eliminate movq instruction.
>
> Regtested on x86_64-pc-linux-gnu. Ok for trunk
On Thu, Jun 30, 2022 at 9:24 AM Jiang, Haochen wrote:
>
> > -Original Message-
> > From: Uros Bizjak
> > Sent: Thursday, June 30, 2022 2:20 PM
> > To: Jiang, Haochen
> > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao
> > Subject: Re: [PATCH] i386: Extend cvtps2pd to memory
> >
> > On Thu, Ju
On Thu, Jun 30, 2022 at 9:41 AM Uros Bizjak wrote:
>
> On Thu, Jun 30, 2022 at 9:24 AM Jiang, Haochen
> wrote:
> >
> > > -Original Message-
> > > From: Uros Bizjak
> > > Sent: Thursday, June 30, 2022 2:20 PM
> > > To: Jiang, Haochen
> > > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao
> >
On Thu, Jun 30, 2022 at 10:45 AM Uros Bizjak wrote:
>
> On Thu, Jun 30, 2022 at 9:41 AM Uros Bizjak wrote:
> >
> > On Thu, Jun 30, 2022 at 9:24 AM Jiang, Haochen
> > wrote:
> > >
> > > > -Original Message-
> > > > From: Uros Bizjak
> > > > Sent: Thursday, June 30, 2022 2:20 PM
> > > >
On Thu, Jun 30, 2022 at 12:56 PM Roger Sayle wrote:
>
>
> Hi Uros,
> Many thanks for your review of the "double word logical operation clean-up"
> patch.
> The revision below incorporates the majority of your feedback, but with one
> or two
> exceptions (required to allow the patch to bootstrap)
On Thu, Mar 11, 2021 at 11:22 PM H.J. Lu wrote:
>
> Update 'P' operand modifier for -fno-plt to support inline assembly
> statements. In 64-bit, we can always load function address with
> @GOTPCREL. In 32-bit, we load function address with @GOT only for
> non-PIC since PIC register may not be av
On Fri, Mar 12, 2021 at 8:59 AM Jakub Jelinek wrote:
>
> Hi!
>
> This is the final patch of the series started with
> https://gcc.gnu.org/pipermail/gcc-patches/2021-March/566139.html
> and continued with
> https://gcc.gnu.org/pipermail/gcc-patches/2021-March/566356.html
> This time, I went through
On Fri, Mar 12, 2021 at 2:38 PM Jakub Jelinek wrote:
>
> On Fri, Mar 12, 2021 at 09:35:00AM +0100, Uros Bizjak via Gcc-patches wrote:
> > Perhaps we can introduce another Y... constraint for AVX512BW and use
> > it here. I think they will be used in other places, too.
>
>
On Fri, Mar 12, 2021 at 4:28 PM Jakub Jelinek wrote:
>
> On Fri, Mar 12, 2021 at 03:34:09PM +0100, Uros Bizjak wrote:
> > > (define_insn "*avx2_pmaddwd"
> > > - [(set (match_operand:V8SI 0 "register_operand" "=x,v")
> > > + [(set (match_operand:V8SI 0 "register_operand" "=Yw")
> >
> > I'm not s
On Fri, Mar 12, 2021 at 5:11 PM Uros Bizjak wrote:
>
> On Fri, Mar 12, 2021 at 4:28 PM Jakub Jelinek wrote:
> >
> > On Fri, Mar 12, 2021 at 03:34:09PM +0100, Uros Bizjak wrote:
> > > > (define_insn "*avx2_pmaddwd"
> > > > - [(set (match_operand:V8SI 0 "register_operand" "=x,v")
> > > > + [(set