This patch allows vectorization when operators are available as
libfuncs, rather than only as insns.
This will be useful for amdgcn where we plan to vectorize loops that
contain integer division or modulus, but don't want to generate inline
instructions for the division algorithm every time.
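For illustration only, the kind of loop described here might look like the sketch below. The function name `div_mod_all` and its signature are hypothetical and not taken from the patch; whether division ends up as an inline sequence or a library call depends on the target.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example loop (not from the patch): element-wise integer
   division and modulus, the operations the message says may be provided
   as libfuncs rather than as insns on some targets.  */
void div_mod_all(int *quot, int *rem, const int *a, const int *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(b[i] != 0);      /* division by zero is undefined in C */
        quot[i] = a[i] / b[i];  /* truncates toward zero */
        rem[i] = a[i] % b[i];   /* sign follows the dividend */
    }
}
```

Each iteration is independent of the others, which is what makes a loop of this shape a candidate for vectorization in the first place.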
This patch is the next instalment in a set of backend patches around
improvements to ptest/vptest. A previous patch optimized the sequence
t=pand(x,y); ptestz(t,t) into the equivalent ptestz(x,y), using the
property that ZF is set to (X&Y) == 0. This patch performs a similar
transformation, conv
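The equivalence used by the earlier patch can be checked with a small portable model. This is only a sketch, not the backend change itself: `vec128`, `pand_model`, `ptestz_model`, `ptestz_via_pand` and `check_equivalence` are hypothetical names that model the instructions in plain C, assuming a 128-bit vector split into two 64-bit halves.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 128-bit vector model: two 64-bit halves.  */
typedef struct { uint64_t lo, hi; } vec128;

/* Model of PAND: bitwise AND of both halves.  */
static vec128 pand_model(vec128 x, vec128 y)
{
    vec128 t = { x.lo & y.lo, x.hi & y.hi };
    return t;
}

/* Model of the ZF result of PTEST: 1 when (x & y) is all zeros.  */
static int ptestz_model(vec128 x, vec128 y)
{
    return ((x.lo & y.lo) | (x.hi & y.hi)) == 0;
}

/* Unoptimized sequence: t = pand(x, y); ptestz(t, t).  */
static int ptestz_via_pand(vec128 x, vec128 y)
{
    vec128 t = pand_model(x, y);
    return ptestz_model(t, t);
}

/* Both forms must agree, because t & t == t == x & y.  */
static void check_equivalence(vec128 x, vec128 y)
{
    assert(ptestz_via_pand(x, y) == ptestz_model(x, y));
}
```

Since `t & t` is just `t`, testing `t` against itself asks the same question as testing `x` against `y`, so the separate AND is redundant.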
On 6/12/23 17:39, juzhe.zhong wrote:
I'll take on this work, which is very important for VLA SLP too. I will
support VLS after I finish VLA SLP.
OK. I think I'll mark Kito's patch as dropped and we'll wait for your
implementation in this space.
jeff
On 6/13/23 09:55, Andrew Stubbs wrote:
Subject:
[PATCH] vect: Vectorize via libfuncs
From:
Andrew Stubbs
Date:
6/13/23, 09:55
To:
"gcc-patches@gcc.gnu.org"
This patch allows vectorization when operators are available as
libfuncs, rather than only as insns.
This will be useful for amdgc
Hi!
As can be seen in the testcase, we don't diagnose #include/#include_next
of a non-existent header if __has_include/__has_include_next is done for
that header first.
The problem is that we normally error the first time some header is not
found, but in the _cpp_FFK_HAS_INCLUDE case obviously don
Hi!
I've noticed that standard_sse_constant_opcode emits some spurious
whitespace around tab, that isn't something which is done for
any other instruction and looks wrong.
Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk
as obvious.
2023-06-13 Jakub Jelinek
*
ping
> Hi,
>
> (It took me a while to get back to this.)
>
> This is a new and improved version of the patch at
> https://gcc.gnu.org/pipermail/gcc-patches/2022-October/602932.html
> It addresses the comment from Joseph that FE_INVALID should really be tested
> in the case of both quiet and s
Spurred by Akari Takahashi's patch to config/sh/divtab.cc, this removes
divtab.cc completely.
divtab.cc was used to calculate a division table for the sh5 media
processor. GCC dropped support for that (unmanufactured) chip back in
2016 and this file simply got missed AFAICT.
Pushed to the
On 6/13/23 00:41, Jin Ma wrote:
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_compute_frame_info): Allocate frame for
FCSR.
(riscv_for_each_saved_reg): Save and restore FCSR in interrupt
functions.
* config/riscv/riscv.md (riscv_frcsr): New patterns.
(riscv_f
On Mon, Jun 12, 2023 at 11:12:45PM +0200, Harald Anlauf via Fortran wrote:
> Dear all,
>
> the attached - actually rather small - patch is the result of a
> rather intensive session with Mikael in an attempt to fix the
> situation that we did not create proper temporaries when passing
> zero-sized
On Linux/x86_64,
921b841350c4fc298d09f6c5674663e0f4208610 is the first bad commit
commit 921b841350c4fc298d09f6c5674663e0f4208610
Author: Kyrylo Tkachov
Date: Mon Jun 12 11:42:29 2023 +0100
simplify-rtx: Implement constant folding of SS_TRUNCATE, US_TRUNCATE
caused
FAIL: gcc.target/i386/
I happened to be digging into the specs to understand a build failure
and spotted mflib and mfwrap. Those were used by the mudflap system
which we ripped out years ago and we just missed these.
I verified x86 still bootstraps after removing these bits.
Pushed to the trunk as obvious,
Jeff
Hi!
On Tue, Jun 13, 2023 at 10:15:49AM +0800, Jiufu Guo wrote:
> David Edelsohn writes:
> >
> > This definitely seems to be a better solution.
> >
> > The TARGET_CONST_ANCHOR change should not be part of this patch. Also
> > there is no ChangeLog for the patch.
>
> Thanks a lot for your quick r
Hi!
As I said in a reply to the original patch: not okay. Sorry.
But some comments on this patch:
On Tue, Jun 13, 2023 at 08:23:35PM +0800, Jiufu Guo wrote:
> + && XINT (SET_SRC (set), 1) == UNSPEC_TIE
> + && XVECEXP (SET_SRC (set), 0, 0) == const0_rtx);
This makes it required that
I intend to commit this tomorrow, unless there are comments.
It does as it says (see commit log): It initializes default-device-var
to the value using the algorithm described in OpenMP 5.2, which
depends on whether OMP_TARGET_OFFLOAD=mandatory was set.
NOTE: With -foffload=disable there is no bi
First update for OpenMP changes that made it into GCC 14.
Wording, technical and other comments are welcome as always.
I intend to commit the attached patch tomorrow.
Tobias
PS: There were a bunch of other useful changes, but those "only" improved
and fixed features already supported or added
On Tue, 13 Jun 2023 10:41:00 PDT (-0700), gcc-patches@gcc.gnu.org wrote:
On 6/13/23 00:41, Jin Ma wrote:
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_compute_frame_info): Allocate frame for
FCSR.
(riscv_for_each_saved_reg): Save and restore FCSR in interrupt
functions.
On Tue, Jun 13, 2023 at 2:16 PM Segher Boessenkool
wrote:
>
> Hi!
>
> On Tue, Jun 13, 2023 at 10:15:49AM +0800, Jiufu Guo wrote:
> > David Edelsohn writes:
> > >
> > > This definitely seems to be a better solution.
> > >
> > > The TARGET_CONST_ANCHOR change should not be part of this patch. Also
Hi Jeff,
Thank you for your response. Regarding the divtab.cc file, I actually came
across it by accident while working on another task. I didn't have a
specific reason for investigating the file, but I noticed the issue and
thought it was worth bringing to your attention.
Thank you for taking ca
Hi Steve,
On 6/13/23 19:45, Steve Kargl via Gcc-patches wrote:
On Mon, Jun 12, 2023 at 11:12:45PM +0200, Harald Anlauf via Fortran wrote:
Dear all,
the attached - actually rather small - patch is the result of a
rather intensive session with Mikael in an attempt to fix the
situation that we di
I came across this when working on the conversion operator deduction fix. We'd
successfully demangle an instantiation of 'template operator X &
()', but fail for 'template operator X ()'. The demangle printer
was trying to specially handle the instantiation in the latter case -- seeing
the te
Quoting "How a computer should talk to people" (as quoted
in "Concepts Error Messages for Humans"):
"Various negative tones or actions are unfriendly: being manipulative,
not giving a second chance, talking down, using fashionable slang,
blaming. We must not seem to blame the person. We should avo
On Mon, Jun 12, 2023 at 11:16 PM Jan Beulich wrote:
> On 13.06.2023 05:28, Fangrui Song wrote:
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/large-data.c
> > @@ -0,0 +1,13 @@
> > +/* { dg-do compile } */
> > +/* { dg-require-effective-target lp64 } */
> > +/* { dg-options "-O2 -mcmode
The LA464 micro-architecture is sensitive to alignment of code. The
Loongson team has benchmarked various combinations of function, the
results [1] show that 16-byte label alignment together with 32-byte
function alignment gives best results in terms of SPEC score.
Add a mtune-based table-driven
From: Pan Li
This patch fixes a bug exposed by the RV32 test case
multiple_rgroup_run-2.c. The mask should be restricted by elen in
vector, and the condition between the vmv.s.x and the vmv.v.x should
take inner_bits_size rather than constants.
Passed both the rv32 and rv64 riscv/rvv
>> unsigned int elen = TARGET_VECTOR_ELEN_64 ? 64 : 32;
Add a comment here to explain why you pick elen to set the LIMIT.
I understand:
1. -march=zve32* ===> ELEN = 32
-march=zve64* ===> ELEN = 64
2. both vmv.v.x and vmv.s.x are restricted to ELEN
For example, when ELEN=32 (-march=zve32*)
Hi,
David Edelsohn writes:
> On Mon, Jun 12, 2023 at 11:30 PM Jiufu Guo wrote:
>>
>>
>> Hi David,
>>
>> David Edelsohn writes:
>> > On Wed, Jun 7, 2023 at 9:55 PM Jiufu Guo wrote:
>> >
>> > Hi,
>> >
>> > This patch checks if a constant is possible to be rotated to/from a
>> > positive
>>
Since there's no evex version for vpcmpeq ymm, ymm, ymm.
Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ready to push to trunk and backport to GCC13.
gcc/ChangeLog:
PR target/110227
* config/i386/sse.md (mov_internal>): Use x instead of v
for alternative 2 sinc
Hi,
Xi Ruoyao writes:
> On Tue, 2023-06-13 at 20:23 +0800, Jiufu Guo via Gcc-patches wrote:
>
>> Compared with the previous version, this adds a ChangeLog and removes
>> the const_anchor parts.
>> https://gcc.gnu.org/pipermail/gcc-patches/2023-June/621356.html.
>
> [Off topic]
>
> const_anchor is just bro
This patch fixes an issue where if you use the -fstack-protector and
-mcpu=power10 options and you have a large stack frame, the GCC compiler will
generate a LWA instruction with a large offset.
Unlike the previous versions of this patch, I dug into it, and I found it was
much more complex than I
From: Pan Li
This patch is a follow-up to the patch below.
https://gcc.gnu.org/pipermail/gcc-patches/2023-June/621347.html
We aligned the predictor style for the define_insn_and_split as suggested
by Kito, to avoid potential issues before we hit them.
Signed-off-by: Pan Li
gcc/Change
LGTM.
juzhe.zh...@rivai.ai
From: pan2.li
Date: 2023-06-14 10:15
To: gcc-patches
CC: juzhe.zhong; rdapp.gcc; jeffreyalaw; pan2.li; yanzhang.wang; kito.cheng
Subject: [PATCH v1] RISC-V: Align the predictor style for define_insn_and_split
From: Pan Li
This patch is considered as the follow up
On 6/12/23 23:28, Jakub Jelinek via Libc-alpha wrote:
On Mon, Jun 12, 2023 at 09:51:02PM +, Joseph Myers wrote:
On Sat, 10 Jun 2023, Jakub Jelinek via Gcc-patches wrote:
I have looked at gnulib stdckdint.h and they are full of workarounds
for various compilers, EDG doesn't do this, clang <
Hi Segher, David,
David Edelsohn writes:
> On Tue, Jun 13, 2023 at 2:16 PM Segher Boessenkool
> wrote:
>>
>> Hi!
>>
>> On Tue, Jun 13, 2023 at 10:15:49AM +0800, Jiufu Guo wrote:
>> > David Edelsohn writes:
>> > >
>> > > This definitely seems to be a better solution.
>> > >
>> > > The TARGET_
Hi,
Segher Boessenkool writes:
> Hi!
>
> As I said in a reply to the original patch: not okay. Sorry.
Thanks a lot for your comments!
I'm also thinking about other solutions:
1. "set (mem/c:BLK (reg/f:DI 1 1) (const_int 0 [0])"
This is the existing pattern. It may be read as an action
t
From: Juzhe-Zhong
This patch optimizes the permutation case that is suitable for the
merge approach.
Consider the following case:
typedef int8_t vnx16qi __attribute__((vector_size (16)));
#define MASK_16 0, 17, 2, 19, 4, 21, 6, 23, 8, 25, 10, 27, 12, 29, 14,
31
void __attribute__ ((
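A mask like the one in the case above takes lane i either from lane i of the first input or from lane i of the second, which is why it can be treated as a merge. The sketch below models that observation in portable C. It is not the patch's implementation, and `permute_model`, `merge_model` and `is_merge_mask` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define NELTS 16

/* Two-input permute model: index k < NELTS selects a[k],
   otherwise b[k - NELTS].  */
static void permute_model(int8_t *out, const int8_t *a, const int8_t *b,
                          const int *mask)
{
    for (int i = 0; i < NELTS; i++) {
        assert(mask[i] >= 0 && mask[i] < 2 * NELTS);
        out[i] = mask[i] < NELTS ? a[mask[i]] : b[mask[i] - NELTS];
    }
}

/* A mask describes a merge when every lane i reads lane i of one input.  */
static int is_merge_mask(const int *mask)
{
    for (int i = 0; i < NELTS; i++)
        if (mask[i] != i && mask[i] != i + NELTS)
            return 0;
    return 1;
}

/* Merge model: a per-lane select between a[i] and b[i].
   Only meaningful when is_merge_mask (mask) is true.  */
static void merge_model(int8_t *out, const int8_t *a, const int8_t *b,
                        const int *mask)
{
    for (int i = 0; i < NELTS; i++)
        out[i] = mask[i] == i ? a[i] : b[i];
}
```

For any mask accepted by `is_merge_mask`, `permute_model` and `merge_model` produce the same result, so the general permute can be replaced by the cheaper per-lane select.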
On 13/06/2023 00:22, Ken Matsui via Libstdc++ wrote:
This patch gets std::is_object to dispatch to new built-in traits,
__is_function, __is_reference, and __is_void.
libstdc++-v3/ChangeLog:
* include/std/type_traits (is_object): Use new built-in traits,
__is_function, __is_refe
On Tue, Jun 13, 2023 at 10:10 PM François Dumont wrote:
>
>
> On 13/06/2023 00:22, Ken Matsui via Libstdc++ wrote:
> > This patch gets std::is_object to dispatch to new built-in traits,
> > __is_function, __is_reference, and __is_void.
> >
> > libstdc++-v3/ChangeLog:
> > * include/std/type_t
Like is already the case for the AVX/AVX2 form, VMOVDDUP - acting on
double precision floating values - is more appropriate to use here, and
it can also result in shorter insn encodings when source is memory or
%xmm0...%xmm7, and no masking is applied (in allowing a 2-byte VEX
prefix then instead o
gcc/
* config/i386/constraints.md: Mention k and r for B.
--- a/gcc/config/i386/constraints.md
+++ b/gcc/config/i386/constraints.md
@@ -162,7 +162,9 @@
;; g GOT memory operand.
;; m Vector memory operand
;; c Constant memory operand
+;; k TLS address that allows insn using non-
... in vec_dupv4sf / *vec_dupv4si. The respective broadcast insns are
never longer (yet sometimes shorter) than the corresponding VSHUFPS /
VPSHUFD, due to the immediate operand of the shuffle insns balancing the
need for VEX3 in the broadcast ones. When EVEX encoding is required the
broadcast insn
There's no reason to constrain this to AVX512VL, as the wider operation
is not usable for more narrow operands only when the possible memory
source is a non-broadcast one. This way even the scalar copysign3
can benefit from the operation being a single-insn one (leaving aside
moves which the compil
Thanks Juzhe. It passed the RV64 riscv/rvv.exp, but I see some failures on RV32,
the same as upstream. This patch may not introduce new failures, but I
am not quite sure whether there is risk here.
lowlist `find build-gcc-newlib-stage2/gcc/testsuite/ -name *.sum |paste -sd ","
-`
Hi Pan,
these failures were present before the patch I suppose? They
don't look related. Is this what you meant by "the same as upstream"?
> FAIL: gcc.target/riscv/rvv/autovec/vls-vlmax/full-vec-move1.c -std=c99 -O3
> -ftree-vectorize --param riscv-autovec-preference=fixed-vlmax (test for
> ex
On Wed, Jun 14, 2023 at 1:55 PM Jan Beulich via Gcc-patches
wrote:
>
> Like is already the case for the AVX/AVX2 form, VMOVDDUP - acting on
> double precision floating values - is more appropriate to use here, and
> it can also result in shorter insn encodings when source is memory or
> %xmm0...%x
None of the failures with (test for excess errors) are big issues; they are caused
by the testcases themselves.
For example:
FAIL: g++.target/riscv/rvv/base/bug-14.C (test for excess errors)
FAIL: g++.target/riscv/rvv/base/bug-9.C (test for excess errors)
These 2 failures are because RV32 doesn't have in
On Wed, Jun 14, 2023 at 1:56 PM Jan Beulich via Gcc-patches
wrote:
>
> gcc/
>
> * config/i386/constraints.md: Mention k and r for B.
Ok.
>
> --- a/gcc/config/i386/constraints.md
> +++ b/gcc/config/i386/constraints.md
> @@ -162,7 +162,9 @@
> ;; g GOT memory operand.
> ;; m Vector memo
Thanks, Juzhe, for the explanation; that makes more sense to me. Sorry for
the disturbance.
Pan
From: juzhe.zh...@rivai.ai
Sent: Wednesday, June 14, 2023 2:31 PM
To: Robin Dapp ; Li, Pan2 ; gcc-patches
Cc: Robin Dapp ; jeffreyalaw ;
Wang, Yanzhang ; kito.cheng
Subject: Re: Re: [PATCH v1] RISC-V: Ali
> I don't have a proper sim environment setup yet. How long does the
> testsuite take
> with spike? Have you tried qemu as well?
Any numbers on this Pan? How many cores do you use for running the testsuite?
Regards
Robin
On Tue, Jun 13, 2023 at 07:54:25PM -0700, Paul Eggert wrote:
> I don't see how you could implement __has_include_next() for
> arbitrary non-GCC compilers, which is what we'd need for glibc users. For
> glibc internals we can use "#include_next" more readily, since we assume a
> new-enough GCC. I.e.
> Any numbers on this Pan? How many cores do you use for running the testsuite?
Sorry for missing this part. It takes about 4-6 minutes with spike and 16 cores.
Pan
-Original Message-
From: Robin Dapp
Sent: Wednesday, June 14, 2023 2:47 PM
To: Li, Pan2 ; juzhe.zh...@rivai.ai; gcc-patche
Yes, I agree with the general assessment (and didn't mean to insinuate
that the FAILs are the compiler's fault or a fault of the patch).
> So these 2 failures in RV32 are not the compiler's bugs. I have seen:
> /* { dg-do run { target { { {riscv_vector} && {rv64} } } } } */ in
> these testcases which can not
On 25/02/23 3:20 pm, Ajit Agarwal via Gcc-patches wrote:
> Hello All:
>
> Here is the patch that uses xxlor instead of fmr where possible.
> Performance results show that fmr is better in power9 and
> power10 architectures whereas xxlor is better in power7 and
> power 8 architectures. fmr is