Hi Lehua,
thanks, LGTM.
One thing maybe for the next patches: It seems to me that we lump all of
the COND_... tests into the cond subdirectory when IMHO they would also
fit into the respective directories of their operations (binop, unop etc).
Right now we will have a lot of rather unrelated tes
On Thu, Aug 24, 2023 at 9:16 PM Andrew Pinski via Gcc-patches
wrote:
>
> Now that MIN/MAX can sometimes be transformed into BIT_AND/BIT_IOR,
> we should allow BIT_AND and BIT_IOR in the early phiopt.
> Also we produce BIT_AND/BIT_IOR for things like `bool0 ? bool1 : 0`
> which seems like a good th
Hi,
This patch implements 32-bit inline lrint using "fctiw". It depends on
patch 1 to do the SImode move from an FP register on P7.
Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
Thanks
Gui Haochen
ChangeLog
rs6000: support 32bit inline lrint
gcc/
PR target/88558
Hi,
This patch enables SImode in FP register on P7. Instruction "fctiw"
stores its integer output in an FP register, so SImode in FP registers
needs to be enabled on P7 if we want to support "fctiw" on P7.
The test case is in the second patch which implements 32bit inline
lrint.
Bootstrapped and t
On Thu, Aug 24, 2023 at 9:16 PM Andrew Pinski via Gcc-patches
wrote:
>
> In PR 106677, I noticed that on the trunk we were producing:
> ```
> _25 = SR.116_117 == 0;
> _27 = (unsigned char) _25;
> _32 = _27 | SR.116_117;
> ```
> From `SR.115_117 != 0 ? SR.115_117 : 1`
> Rather than:
> ```
>
On Thu, Aug 24, 2023 at 9:16 PM Andrew Pinski via Gcc-patches
wrote:
>
> Even though this is handled by other code inside both VRP and CCP,
> sometimes we want to optimize this outside of VRP and CCP.
> An example is given in PR 106677 where phiopt will happen
> after VRP (which removes a cast for
On 2023/8/25 at 12:16 PM, WANG Xuerui wrote:
On 8/25/23 12:01, Lulu Cheng wrote:
Since the slt instruction does not distinguish between 32-bit and
64-bit operations
under the LoongArch 64-bit architecture, if the operands of slt are
of SImode, symbol
expansion is required before operation.
Hint:“符号扩
Hi!
I'd like to ping this patch for acknowledgement from the Ada team.
We have successfully compiled a cross-native toolchain with Ada enabled
for loongarch64-linux-gnuf64 (or loongarch64-linux-gnu), and have run the
regtests with the following results:
While the failures are being worked on, we
Hoist want_to_gcse_p () calls rtx_cost () to compute max distance for
hoist candidates. For a simple const (say 6 which needs separate insn "LI 6")
backend currently returns 0, causing Hoist to bail and elide GCSE.
Note that constants requiring more than 1 insn to set up were working
fine since ri
Hi Jivan,
On 8/24/23 08:45, Jivan Hakobyan via Gcc-patches wrote:
This patch fixes failing stack_save_restore_1/2 test cases.
After commit 6619b3d4c15c, the size of the frame changed.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/stack_save_restore_1.c: Update frame size
* gcc.t
gcc/ChangeLog:
* ada/Makefile.rtl: Add LoongArch support.
* ada/libgnarl/s-linux__loongarch.ads: New.
* ada/libgnat/system-linux-loongarch.ads: New.
* config/loongarch/loongarch.h: mark normalized options
passed from driver to gnat1 as explicit for multilib.
vpmaskmov{d,q} is available for TARGET_AVX2, vmaskmov{ps,pd} is
available for TARGET_AVX; w/o TARGET_AVX2, we can use vmaskmov{ps,pd}
for VI48_128_256
Bootstrapped and regtested on x86_64-pc-linux{-m32,}.
Ready to push to trunk.
gcc/ChangeLog:
PR target/19
* config/i386/sse.md (
On 8/25/23 12:01, Lulu Cheng wrote:
Since the slt instruction does not distinguish between 32-bit and 64-bit
operations
under the LoongArch 64-bit architecture, if the operands of slt are of SImode,
symbol
expansion is required before operation.
Hint:“符号扩展” is "sign extension" (as noun) or "sig
Since the slt instruction does not distinguish between 32-bit and 64-bit
operations
under the LoongArch 64-bit architecture, if the operands of slt are of SImode,
symbol
expansion is required before operation.
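The point about slt can be sketched in C. This is only a model under stated assumptions, not LoongArch code: `slt64`, `sign_extend32`, `zero_extend32` and `check_slt` are hypothetical names, and a 64-bit signed comparison stands in for the instruction. It shows why a 32-bit operand has to be sign-extended (the "symbol expansion" above) before a full-register comparison gives the 32-bit answer.

```c
#include <assert.h>
#include <stdint.h>

/* Model of slt on a 64-bit register file: a signed compare of the
   full 64-bit registers, returning 1 if ra < rb and 0 otherwise.  */
int slt64 (int64_t ra, int64_t rb) { return ra < rb; }

/* A 32-bit value held in a 64-bit register with sign extension.  */
int64_t sign_extend32 (int32_t x) { return (int64_t) x; }

/* The same value with the upper 32 bits cleared instead.  */
int64_t zero_extend32 (int32_t x) { return (int64_t) (uint32_t) x; }

void check_slt (void)
{
  /* -1 < 1 as 32-bit signed values, and sign extension preserves that.  */
  assert (slt64 (sign_extend32 (-1), sign_extend32 (1)) == 1);
  /* Without sign extension, -1 looks like 4294967295 and the compare flips.  */
  assert (slt64 (zero_extend32 (-1), zero_extend32 (1)) == 0);
}
```

Calling `check_slt ()` from a test driver exercises both cases.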
But similar to the following test case, symbol expansion can be omitted:
exte
On 8/24/23 12:56 AM, Kewen.Lin wrote:
> By looking into the uses of function rs6000_pcrel_p, I think we can
> just replace it with TARGET_PCREL. Previously we didn't require PCREL
> to be unset for any unsupported target/OS, so we need rs6000_pcrel_p() to
> ensure it's really supported in those use place
This patch refactors Phase 3 (Demand fusion) and renames it to Earliest
fusion.
I do the refactor for the following reasons:
1. The current implementation of phase 3 does too many things, which makes the
code quite messy and not easy to maintain.
2. The demand fusion I do
lgtm
On Fri, Aug 25, 2023 at 9:49 AM Pan Li via Gcc-patches
wrote:
>
> From: Pan Li
>
> There will be a case like below for intrinsic and autovec combination.
>
> vfadd RTZ <- intrinsic static rounding
> vfnmadd <- autovec/autovec-opt
>
> The autovec generated vfnmadd should take DYN mode,
On 8/24/23 12:35 PM, Michael Meissner wrote:
> On Thu, Jul 20, 2023 at 10:05:28AM +0530, jeevitha wrote:
>> gcc/
>> PR target/110411
>> * config/rs6000/rs6000.h (enum rs6000_builtin_type_index): Add fields
>> to hold PTImode type.
>> * config/rs6000/rs6000-builtin.cc (rs6000_ini
V2 changes: Address comments from Robin.
Hi,
This patch adds conditional sign/zero extension and truncation autovec
patterns by combining convert and vcond_mask patterns.
For quad truncation, two vncvt instructions are generated. This patch
combines the second vncvt and vmerge to form a masked vn
Committed.
gcc/ChangeLog:
* config/riscv/riscv-vsetvl.cc (pass_vsetvl::compute_local_properties):
Add early continue.
---
gcc/config/riscv/riscv-vsetvl.cc | 2 ++
1 file changed, 2 insertions(+)
diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index f75
From: Pan Li
There will be a case like below for intrinsic and autovec combination.
vfadd RTZ <- intrinsic static rounding
vfnmadd <- autovec/autovec-opt
The autovec generated vfnmadd should take DYN mode, and the
frm must be restored before the vfnmadd insn. This patch
would like to fix
Thanks Kito, will commit it after VFMADD, VFMSAC.
Pan
-Original Message-
From: Kito Cheng
Sent: Thursday, August 24, 2023 10:24 PM
To: Li, Pan2
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang
Subject: Re: [PATCH v1] RISC-V: Support rounding mode for VFNMSAC/VFNMSUB
> Date: Wed, 23 Aug 2023 11:10:02 +0200
> From: Jan Hubicka via Gcc-patches
> Hi,
> this patch adds missing profile update to maybe_optimize_range_tests.
[...]
> Jakub, it seems that the code is originally yours. Any idea why those are
> not turned to
> constant true or false conditionals?
>
On 8/24/23 11:28, Palmer Dabbelt wrote:
Reviewed-by: Palmer Dabbelt
I think Joern is still looking into fixing up all these explicit ISA
strings in the tests, but I don't see any reason to block fixing
failing tests on that.
Thanks!
Committed
Patrick
On 8/24/23 2:28 PM, Harald Anlauf via Fortran wrote:
Dear all,
the attached patch adds stricter bounds-checking for DATA statements
with implied-do. I chose to allow overindexing (for arrays of rank
greater than 1) for -std=legacy, as there might be codes in the wild
that need this (and this is
On Thu, Aug 24, 2023 at 5:05 PM Hongyu Wang via Gcc-patches
wrote:
>
> Hi,
>
> For PR27, the wrong code was caused by a wrong expander for maskz.
> Correct the parameter order for the avx512ne2ps2bf16_maskz expander.
>
> Bootstrapped/regtested on x86-64-pc-linux-gnu{m32,}.
> OK for master and backpor
> From: benjamin priour
>
> Hi,
>
> Below is the first batch of a series of patches to transition
> the analyzer testsuite from gcc.dg/analyzer to c-c++-common/analyzer.
> I do not know how long this series will be, thus the patch was not
> numbered.
>
> For the grand majority of the tests, the tran
Dear all,
the attached patch adds stricter bounds-checking for DATA statements
with implied-do. I chose to allow overindexing (for arrays of rank
greater than 1) for -std=legacy, as there might be codes in the wild
that need this (and this is accepted by some other compilers, while
NAG is strict
Related Discussion:
https://inbox.sourceware.org/gcc-patches/12fb5088-3f28-0a69-de1e-f387371a5...@gmail.com/
This patch updates the sync instructions to ensure that no insn is left
without a type attribute. Updates a total of 6 insns to have type "atomic"
Tested for regressions using rv32/64 mult
Add new pattern involving vec_merge RTX that is produced by combine from the
combination of sse4_1_pinsrq and *movdi_internal:
7: r86:DI=0
8: r85:V2DI=vec_merge(vec_duplicate(r86:DI),r87:V2DI,0x2)
REG_DEAD r87:V2DI
REG_DEAD r86:DI
Successfully matched this instruction:
(set (re
GCC maintainers:
Version 3, fixed the built-in instance names. Missed removing the "n" from
the name. Added the tighter constraints on the predicates for the
define_insn. Updated the wording for the built-ins in the
documentation file. Changed the test file name again. Updated the
ChangeLog file,
Kewen, Peter:
> on 2023/8/17 08:19, Carl Love wrote:
> > GCC maintainers:
> >
> > Version 2, renamed the built-in instances. Changed the name of the
> > overloaded built-in. Added the missing documentation for the new
> > built-ins. Fixed typos. Changed name of the test. Updated the
> > effe
Now that MIN/MAX can sometimes be transformed into BIT_AND/BIT_IOR,
we should allow BIT_AND and BIT_IOR in the early phiopt.
Also we produce BIT_AND/BIT_IOR for things like `bool0 ? bool1 : 0`
which seems like a good thing to allow early on too.
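As a rough illustration of the equivalences mentioned here (not code from the patch), the following C sketch checks that, for operands restricted to 0 and 1, `b0 ? b1 : 0` agrees with `b0 & b1`, and that MIN and MAX agree with AND and OR. All function names are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Conditional form, as it might be written in source code.  */
bool cond_form (bool b0, bool b1) { return b0 ? b1 : false; }

/* Bitwise form that the conditional can be turned into.  */
bool and_form (bool b0, bool b1) { return b0 & b1; }

/* MIN and MAX, used here only on values known to be 0 or 1.  */
int min01 (int a, int b) { return a < b ? a : b; }
int max01 (int a, int b) { return a > b ? a : b; }

/* Count the 0/1 input pairs on which any of the equivalences fails.  */
int count_bool_mismatches (void)
{
  int bad = 0;
  for (int a = 0; a <= 1; a++)
    for (int b = 0; b <= 1; b++)
      {
        if (cond_form (a, b) != and_form (a, b))
          bad++;
        if (min01 (a, b) != (a & b))
          bad++;
        if (max01 (a, b) != (a | b))
          bad++;
      }
  return bad;
}

/* Self-check: every equivalence holds on all four input pairs.  */
void check_bool_forms (void)
{
  assert (count_bool_mismatches () == 0);
}
```

The MIN/MAX equivalences only hold because the inputs are limited to 0 and 1; for wider ranges they are different operations.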
OK? Bootstrapped and tested on x86_64-linux-gnu with
In PR 106677, I noticed that on the trunk we were producing:
```
_25 = SR.116_117 == 0;
_27 = (unsigned char) _25;
_32 = _27 | SR.116_117;
```
From `SR.115_117 != 0 ? SR.115_117 : 1`
Rather than:
```
_119 = MAX_EXPR <1, SR.115_117>;
```
Or (rather)
```
_119 = SR.115_117 | 1;
```
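The three forms can be compared with a small C sketch. It is an illustration only: the helper names are hypothetical, and it assumes the value is known to be 0 or 1, which this snippet does not state explicitly. Under that assumption all three forms agree; for a wider range the MAX form still matches the conditional for unsigned values, but `x | 1` does not.

```c
#include <assert.h>

/* The conditional from the PR: x != 0 ? x : 1.  */
unsigned char cond_or_one (unsigned char x) { return x != 0 ? x : 1; }

/* MAX_EXPR <1, x>.  */
unsigned char max_with_one (unsigned char x) { return x > 1 ? x : 1; }

/* x | 1.  */
unsigned char or_with_one (unsigned char x) { return (unsigned char) (x | 1); }

/* Return 1 if all three forms agree for every x in [lo, hi], else 0.
   Callers are expected to keep hi at or below 255.  */
int forms_agree (unsigned lo, unsigned hi)
{
  for (unsigned x = lo; x <= hi; x++)
    if (cond_or_one (x) != max_with_one (x)
        || cond_or_one (x) != or_with_one (x))
      return 0;
  return 1;
}

void check_forms (void)
{
  assert (forms_agree (0, 1));   /* All forms agree when x is 0 or 1.  */
  assert (!forms_agree (2, 2));  /* x = 2: conditional gives 2, x | 1 gives 3.  */
}
```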
Due to t
Even though this is handled by other code inside both VRP and CCP,
sometimes we want to optimize this outside of VRP and CCP.
An example is given in PR 106677 where phiopt will happen
after VRP (which removes a cast for a comparison) and then
phiopt will optimize the phi to be `a | 1` which can the
On Thu, 24 Aug 2023 11:04:59 PDT (-0700), Patrick O'Neill wrote:
Resolves failures like this on rv32gcv linux:
compiler exited with status 1
output is:
In file included from
/tc-baseline/build-linux-gcv/sysroot/usr/include/features.h:515,
from
/tc-baseline/build-linux-gcv/sysro
Resolves failures like this on rv32gcv linux:
compiler exited with status 1
output is:
In file included from
/tc-baseline/build-linux-gcv/sysroot/usr/include/features.h:515,
from
/tc-baseline/build-linux-gcv/sysroot/usr/include/bits/libc-header-start.h:33,
from
On Thu, Jul 20, 2023 at 10:05:28AM +0530, jeevitha wrote:
> Hi All,
>
> The following patch has been bootstrapped and regtested on powerpc64le-linux.
>
> When the user specifies PTImode as an attribute, it breaks. Created
> a tree node to handle PTImode types. PTImode attribute helps in generatin
I've now prepared the patch to support following config:
--disable-libstdcxx-dual-abi --with-default-libstdcxx-abi=new
and so detected yet another problem with src/c++98/compatibility.cc. We
need basic_istream<>::ignore(streamsize) definitions that rely here but
not the rest of it.
François
>
>- Phase 3 - Backward && forward demanded info propagation and fusion
> across
> blocks.
>
Need to update the comment here.
>- Phase 6 - Propagate AVL between vsetvl instructions.
Need to update the comment here too.
> +/* Return true if the current VSETVL is dominated by preceding VSETVL.
On Wed, Aug 23, 2023 at 04:23:00PM -0400, Jason Merrill wrote:
> I'd be surprised if this would affect any real code, but I suppose so. In
> any case I'd like to fix this at the same time as the local statics, to
> avoid changing their mangled name twice.
Ok.
Running now into a problem with abi ta
On 22.08.23 15:37, Jakub Jelinek wrote:
On Sun, Jul 23, 2023 at 04:15:21PM -0600, Sandra Loosemore wrote:
[...]
In the Fortran front end, most of the semantic processing happens during
the translation phase, so the parse phase just collects the intervening
statements, checks them for errors, and
On Thu, Aug 24, 2023 at 05:47:09PM +0200, Richard Biener via Gcc-patches wrote:
> > Do you think that the pass is worthy of inclusion into upstream GCC? What
> > are
> > some things that I should change? Should I try to put the pass in different
> > places in passes.def?
>
> The most obvious plac
On Thu, 24 Aug 2023 at 08:27, Thiago Jung Bauermann
wrote:
>
> Since commit e7a36e4715c7 "[PATCH] RISC-V: Support simplify (-1-x) for
> vector." these tests fail on aarch64-linux:
>
> === g++ tests ===
>
> Running g++:g++.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp ...
> FAIL:
> On 24.08.2023 at 17:07, Filip Kastl wrote:
>
> Hi,
>
> As a part of my bachelor thesis under the supervision of Honza (Jan Hubicka),
> I
> implemented a new PHI elimination algorithm into GCC. The algorithm is
> described in the following article:
>
> Braun, M., Buchwald, S., Hack, S.,
This patch fixes failing stack_save_restore_1/2 test cases.
After commit 6619b3d4c15c, the size of the frame changed.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/stack_save_restore_1.c: Update frame size
* gcc.target/riscv/stack_save_restore_2.c: Likewise.
--
With the best regar
> On Thu, Aug 24, 2023 at 3:15 PM Jan Hubicka via Gcc-patches
> wrote:
> >
> > Hi,
> > this patch extends verifier to check that all probabilities and counts are
> > initialized if profile is supposed to be present. This is a bit complicated
> > by the possibility that we inline !flag_guess_branch
Hi Marek.
> On Thu, Aug 17, 2023 at 05:37:03PM +0200, Jose E. Marchesi via Gcc-patches
> wrote:
>>
>> > On Thu, 17 Aug 2023, Jose E. Marchesi via Gcc-patches wrote:
>> >
>> >> +@opindex Wcompare-distinct-pointer-types
>> >> +@item -Wcompare-distinct-pointer-types
>> >
>> > This @item should sa
Hi,
As a part of my bachelor thesis under the supervision of Honza (Jan Hubicka), I
implemented a new PHI elimination algorithm into GCC. The algorithm is
described in the following article:
Braun, M., Buchwald, S., Hack, S., Leißa, R., Mallon, C., Zwinkau, A.
(2013). Simple and Efficient Constru
gcc/analyzer/ChangeLog:
PR analyzer/105899
* region-model.cc (fragment::has_null_terminator): Handle
SK_BITS_WITHIN.
---
gcc/analyzer/region-model.cc | 21 -
1 file changed, 20 insertions(+), 1 deletion(-)
diff --git a/gcc/analyzer/region-model.cc b/gcc
gcc/analyzer/ChangeLog:
PR analyzer/105899
* region-model-manager.cc
(region_model_manager::get_or_create_initial_value): Simplify
INIT_VAL(ELEMENT_REG(STRING_REG), CONSTANT_SVAL) to
CONSTANT_SVAL(STRING[N]).
---
gcc/analyzer/region-model-manager.cc | 19 +++
This patch kit makes improvements to the analyzer's new strlen
implementation, and wires it up to strcpy and strcat.
For example, given:
#include
void test (void)
{
char buf[10];
strcpy (buf, "hello world!");
}
we now emit:
demo.c: In function ‘test’:
demo.c:6:3: warning: stac
gcc/analyzer/ChangeLog:
PR analyzer/105899
* region-model.cc (fragment::has_null_terminator): Move STRING_CST
handling to fragment::string_cst_has_null_terminator; also use it to
handle INIT_VAL(STRING_REG).
(fragment::string_cst_has_null_terminator): New, fr
gcc/analyzer/ChangeLog:
PR analyzer/105899
* region-model.cc (iterable_cluster::iterable_cluster): Add
symbolic binding keys to m_symbolic_bindings.
(iterable_cluster::has_symbolic_bindings_p): New.
(iterable_cluster::m_symbolic_bindings): New field.
gcc/analyzer/ChangeLog:
PR analyzer/105899
* call-details.cc
(call_details::check_for_null_terminated_string_arg): Split into
overloads, one taking just an arg_idx, the other a new
"include_terminator" param.
* call-details.h: Likewise.
* kf.c
This patch reimplements the analyzer's implementation of strcpy using
the region_model::scan_for_null_terminator infrastructure, so that e.g.
it can complain about out-of-bounds reads/writes, unterminated strings,
etc.
gcc/analyzer/ChangeLog:
PR analyzer/105899
* kf.cc (kf_strcpy::
gcc/analyzer/ChangeLog:
* kf.cc (kf_memcpy_memmove::impl_call_pre): Reimplement using
region_model::copy_bytes.
* region-model.cc (region_model::read_bytes): New.
(region_model::copy_bytes): New.
* region-model.h (region_model::read_bytes): New decl.
gcc/analyzer/ChangeLog:
PR analyzer/105899
* region-model.cc (region_model::get_string_size): Delete both.
* region-model.h (region_model::get_string_size): Delete both
decls.
---
gcc/analyzer/region-model.cc | 29 -
gcc/analyzer/region-m
gcc/analyzer/ChangeLog:
* engine.cc (impl_path_context::impl_path_context): Add logger
param.
(impl_path_context::bifurcate): Add log message.
(impl_path_context::terminate_path): Likewise.
(impl_path_context::m_logger): New field.
(exploded_graph::pr
Hi!
The following patch on top of PR110349 patch (weak dependency,
only for -Wc++26-extensions, I could split that part into an independent
patch) and PR110342 patch (again weak dependency, this time mainly
because it touches the same code in cp_parser_static_assert and
nearby spot in udlit-error1
LGTM
On Thu, Aug 24, 2023 at 5:35 PM Pan Li via Gcc-patches
wrote:
>
> From: Pan Li
>
> There will be a case like below for intrinsic and autovec combination.
>
> vfadd RTZ <- intrinsic static rounding
> vfnmsub <- autovec/autovec-opt
>
> The autovec generated vfnmsub should take DYN mode,
Ping. I refined the code and some comments a bit and added a test
case.
My question in general would still be: Is this something we want
given that we potentially move some of combine's work a bit towards
the front of the RTL pipeline?
Regards
Robin
Subject: [PATCH] fwprop: Allow UNARY_P and
Hi!
The following patch implements C++26 unevaluated-string.
As it seems to me to be just extra pedantry, it is implemented only for
-std=c++26 or -std=gnu++26 and later and only with -pedantic/-pedantic-errors.
Nothing is done for inline asm, while the spec changes those, it changes it
to a balanced t
On 8/23/23 18:42, Vineet Gupta wrote:
Seriously, I detest it too, but the irony is I've now made my 2nd change
in there and keep adding to ugliness :-(
Happens to all of us sometimes.
So I think your change makes sense. But I think it can be refined to
simplify the larger chunk of
On Wed, 23 Aug 2023, Jason Merrill wrote:
> On 8/21/23 21:51, Patrick Palka wrote:
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look like
> > a reasonable approach? I didn't observe any compile time/memory impact
> > of this change.
> >
> > -- >8 --
> >
> > As described in d
On Thu, Aug 24, 2023 at 3:15 PM Jan Hubicka via Gcc-patches
wrote:
>
> Hi,
> this patch extends verifier to check that all probabilities and counts are
> initialized if profile is supposed to be present. This is a bit complicated
> by the possibility that we inline !flag_guess_branch_probability f
Hi,
this patch extends verifier to check that all probabilities and counts are
initialized if profile is supposed to be present. This is a bit complicated
by the possibility that we inline !flag_guess_branch_probability function
into function with profile defined and in this case we need to stop
ve
From: Paul Dreik
Tested x86_64-linux. Pushed to trunk.
-- >8 --
libstdc++-v3/ChangeLog:
PR libstdc++/02
* testsuite/std/format/string.cc: Check wide character format
strings with out-of-range widths.
---
libstdc++-v3/testsuite/std/format/string.cc | 15
On Wed, 23 Aug 2023 at 19:48, Paul Dreik via Libstdc++
wrote:
>
> This fixes pointer arithmetic made on a null pointer, which I found
> through fuzzing.
> Tested on debian/amd64.
>
> Thanks, Paul
Thanks. Pushed to trunk, backport to gcc-13 to follow.
I also added your testcase from the bug repor
Tested x86_64-linux. Pushed to trunk.
-- >8 --
Update a preprocessor condition using __cplusplus and _GLIBCXX_HOSTED
to use the relevant feature test macro for .
Also add comments to some conditions saying which C++ standard revision
the check corresponds to.
libstdc++-v3/ChangeLog:
*
Tested x86_64-linux. Pushed to trunk.
-- >8 --
libstdc++-v3/ChangeLog:
* testsuite/std/format/functions/format_to.cc: Avoid warning for
unused variables.
---
libstdc++-v3/testsuite/std/format/functions/format_to.cc | 8
1 file changed, 4 insertions(+), 4 deletions(-)
d
Tested x86_64-linux. Pushed to trunk.
-- >8 --
This is a no-op for libstdc++, because our intmax_t is a 64-bit type and
so is incapable of representing the largest and smallest ratios from
C++11, let alone the new ones. I've added them to the file anyway (and
defined the feature test macro) so th
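The size limit described in this message can be checked with a short C sketch. It is an illustration, not libstdc++ code, and the helper names are hypothetical. It finds the largest power of ten that fits in `intmax_t`; with a 64-bit `intmax_t` that is 10^18, so neither the C++11 ratios based on 10^21 and 10^24 nor the newer ones based on 10^27 and 10^30 fit.

```c
#include <assert.h>
#include <stdint.h>

/* Largest n such that 10^n is representable in intmax_t.  */
int max_pow10_exponent (void)
{
  intmax_t v = 1;
  int n = 0;
  /* Multiply only while the next multiplication cannot overflow.  */
  while (v <= INTMAX_MAX / 10)
    {
      v *= 10;
      n++;
    }
  return n;
}

/* Return 1 if 10^n fits in intmax_t, 0 otherwise.  */
int pow10_fits (int n) { return n <= max_pow10_exponent (); }

void check_ratio_limits (void)
{
  assert (pow10_fits (18));   /* exa (10^18) fits in a 64-bit intmax_t.  */
  assert (!pow10_fits (21));  /* zetta (10^21) does not.  */
  assert (!pow10_fits (30));  /* quetta (10^30) does not either.  */
}
```

The assertions in `check_ratio_limits` assume a 64-bit `intmax_t`, as the message states for libstdc++.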
On Thu, 24 Aug 2023, Robin Dapp wrote:
> This causes an ICE in
> gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-11.c
> (internal compiler error: in get_group_load_store_type, at
> tree-vect-stmts.cc:2121)
>
> #include
>
> #define TEST_LOOP(DATA_TYPE, INDEX_TYPE)
Tested x86_64-linux. Pushed to trunk. Maybe worth backporting.
-- >8 --
Print the locale's name, except when it uses the same named C locale for
all categories except one, in which case print something like:
std::locale = "en_GB.UTF-8" with "LC_CTYPE=en_US.UTF-8"
libstdc++-v3/ChangeLog:
load_p is set and used as to whether the stmt is a memory operation,
not whether it is only a load. The following renames it to ldst_p
to avoid this confusion. It also replaces checking for a VUSE
with checking STMT_VINFO_DATA_REF since VUSE checking doesn't
work for pattern matched stores where
Tested x86_64-linux. Pushed to trunk.
-- >8 --
As the PR says, including the template arguments in the GDB output of
these class templates can result in very long names, especially for
std::variant. You can use 'whatis' or other GDB commands to get details
of the type, we don't need to include it
This causes an ICE in
gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-11.c
(internal compiler error: in get_group_load_store_type, at
tree-vect-stmts.cc:2121)
#include
#define TEST_LOOP(DATA_TYPE, INDEX_TYPE) \
void __attribute__ ((noinline,
On Tue, Aug 22, 2023 at 12:53:19PM -0600, Sandra Loosemore wrote:
> > All these c-c++-common testsuite changes will now FAIL after the C patch but
> > before the C++. It is nice to have the new c-c++-common tests in a separate
> > patch, but these tweaks which can't be just avoided need the tempor
Committed, thanks Robin.
Pan
-Original Message-
From: Gcc-patches On Behalf
Of Robin Dapp via Gcc-patches
Sent: Thursday, August 24, 2023 7:03 PM
To: 钟居哲 ; gcc-patches
Cc: rdapp@gmail.com; kito.cheng ; kito.cheng
; Jeff Law
Subject: Re: [PATCH] RISC-V: Add COND_LEN_FNMA/COND_LEN_
OK.
Regards
Robin
When a loop is marked with
#pragma GCC novector
the following makes sure to also skip BB vectorization for contained
blocks. That avoids gcc.dg/vect/bb-slp-29.c failing on aarch64
because of extra BB vectorization therein. I'm not specifically
dealing with sub-loops of novector loops, the des
Hi,
This patch adds conditional autovec conversion between INT and FP
by combining the convert and vcond_mask patterns.
Best,
Lehua
gcc/ChangeLog:
* config/riscv/autovec-opt.md (*cond_):
New combine pattern.
(*cond_): Ditto.
(*cond_): Ditto.
(*cond_): Ditto.
Ping.
Middle-end patch:
[PATCH V2] gimple_fold: Support COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple
fold (gnu.org)
has been approved and supported.
This patch has been pending for 8 days.
juzhe.zh...@rivai.ai
From: Juzhe-Zhong
Date: 2023-08-16 21:20
To: gcc-patches
CC: kito.cheng; kito.cheng; jef
Committed, thanks Richard.
Pan
-Original Message-
From: Gcc-patches On Behalf
Of Richard Sandiford via Gcc-patches
Sent: Thursday, August 24, 2023 6:34 PM
To: Juzhe-Zhong
Cc: gcc-patches@gcc.gnu.org; rguent...@suse.de
Subject: Re: [PATCH V2] gimple_fold: Support
COND_LEN_FNMA/COND_LEN
Juzhe-Zhong writes:
> Hi, Richard and Richi.
>
> Currently, GCC supports COND_LEN_FMA for floating-point with **NO** -ffast-math.
> It's supported in tree-ssa-math-opts.cc. However, GCC fails to support
> COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS.
>
> Consider the following case:
> #define TEST_TYPE(T
On 2023/8/24 18:20, Robin Dapp wrote:
Yes, it's better to call it one_quad.
I'd suggest to go with quarter as before or quarter_width_op
or something.
OK for quarter.
Is this necessary for recognizing a different pattern?
Are you saying that the testcases xxx-1 and xxx-2 are duplicate
LGTM.
Regards
Robin
> Yes, it's better to call it one_quad.
I'd suggest to go with quarter as before or quarter_width_op
or something.
>> Is this necessary for recognizing a different pattern?
>
> Are you saying that the testcases xxx-1 and xxx-2 are duplicated? If
> so, I have no problem removing it and just kee
Hi Robin,
On 2023/8/24 17:59, Robin Dapp wrote:
Hi Lehua,
thanks, just tiny non-functional nits.
- rtx ops[] = {operands[0], quarter};
- icode = code_for_pred_trunc (mode);
- riscv_vector::emit_vlmax_insn (icode, riscv_vector::RVV_UNOP, ops);
+ rtx half = gen_reg_rtx (mode);
Not really
Consider the following case:
int __attribute__ ((noinline, noclone))
condition_reduction (int *a, int min_v)
{
int last = 66; /* High start value. */
for (int i = 0; i < 4; i++)
if (a[i] < min_v)
last = i;
return last;
}
--param=riscv-autovec-preference=fixed-vlmax --param=risc
Hi Lehua,
thanks, just tiny non-functional nits.
> - rtx ops[] = {operands[0], quarter};
> - icode = code_for_pred_trunc (mode);
> - riscv_vector::emit_vlmax_insn (icode, riscv_vector::RVV_UNOP, ops);
> + rtx half = gen_reg_rtx (mode);
Not really a half anymore now? :)
> +#include
> +
> +#
Jeff Law writes:
> On 8/22/23 02:08, juzhe.zh...@rivai.ai wrote:
>> Yes, I agree that long-term we want everything to be optimized as early as
>> possible.
>>
>> However, IMHO, it's impossible to support every conditional pattern
>> in the middle-end (match.pd).
>> It's a really big number.
>>
>
Richard Sandiford writes:
> Rather than hiding this in target code, perhaps we should add a
> target-independent concept of an "eh_return taken" flag, say
> EH_RETURN_TAKEN_RTX.
>
> We could define it so that, on targets that define EH_RETURN_TAKEN_RTX,
> a register EH_RETURN_STACKADJ_RTX and a re
>> The use_real_merge just appeared odd to me here because there is
>> nothing to merge. But in the end it's just to omit the vundef operand
>> so good for now. There is an increasing number of opportunities to
>> refactor in riscv-v.cc, though ;)
I think we can change use_real_merge into use_du
Richard Biener writes:
> The following adds the capability to do SLP on .MASK_STORE, I do not
> plan to add interleaving support.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?
LGTM, thanks.
Richard
> Thanks,
> Richard.
>
> PR tree-optimization/15
> gcc/
> * tree-v
We assume that all root stmts which compose the total reduction chain
are vectorized but fail to account for the cost of adding back the
scalar defs we are not vectorizing. The following rectifies this,
fixing the gcc.dg/tree-ssa/slsr-11.c FAIL on aarch64.
Bootstrapped and tested on x86_64-unknow
From: Pan Li
There will be a case like below for intrinsic and autovec combination.
vfadd RTZ <- intrinsic static rounding
vfnmsub <- autovec/autovec-opt
The autovec generated vfnmsub should take DYN mode, and the
frm must be restored before the vfnmsub insn. This patch
would like to fix
>> Why is that necessary? Just for the popcount I presume?
>> Can't we rather have a new case for a scalar destination? I find
>> the code a bit misleading now as we check m_dest_mode and then not
>> use it.
I am gonna fix it in V2.
>> The rest looks good to me. Note that my machine crashed w
The scalar FNMADD/FNMSUB and SVE FNMLA/FNMLS instructions mean
that either side of a subtraction can start an accumulator chain.
However, Advanced SIMD doesn't have an equivalent instruction.
This means that, for Advanced SIMD, a subtraction can only be
fused if the second operand is a multiplicati
Hi Juzhe,
> vcpop.m a5,v0
> beq a5,zero,.L3
> addi a5,a5,-1
> vsetvli a4,zero,e32,m1,ta,ma
> vcompress.vm v2,v3,v0
> vslidedown.vx v2,v2,a5
> vmv.x.s a0,v2
> .L3:
> sext.w a0,a0
Mhm, where is this sext coming from? Thought I had this c