Hi,
Richard Biener writes:
> On Mon, 24 Jul 2023, Jiufu Guo wrote:
>
>>
>> Hi Martin,
>>
>> I'm not sure about your current opinion on re-using the ipa-sra code
>> in the light-expander-sra. If there is anything I could provide as input,
>> please let me know.
>>
>> And I'm thinking about the difference betwe
From: Pan Li
This patch adds support for the rounding mode API for both
VFSUB and VFRSUB, as in the samples below.
* __riscv_vfsub_vv_f32m1_rm
* __riscv_vfsub_vv_f32m1_rm_m
* __riscv_vfsub_vf_f32m1_rm
* __riscv_vfsub_vf_f32m1_rm_m
* __riscv_vfrsub_vf_f32m1_rm
* __riscv_vfrsub_vf_f32m1_rm_m
Sig
>>> I'm not against continuing with the more well-known approach for now
>>> but we should keep in mind that there might still be potential for improvement.
>
> No. I don't think it's faster.
I did a quick check on my x86 laptop and it's roughly 25% faster there.
That's consistent with the literature.
From: Ju-Zhe Zhong
Hi, Richard and Richi.
Based on the suggestions from Richard:
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625396.html
This patch chooses approach (1) that Richard provided, meaning:
RVV implements cond_* optabs as expanders. RVV therefore supports
both IFN_COND_ADD an
On 7/29/23 03:13, Xiao Zeng wrote:
This patch recognizes Zicond patterns when the select pattern
with condition eq or neq to 0 (using eq as an example), namely:
1 rd = (rs2 == 0) ? non-imm : 0
2 rd = (rs2 == 0) ? non-imm : non-imm
3 rd = (rs2 == 0) ? reg : non-imm
4 rd = (rs2 == 0) ? reg : re
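For readers less familiar with Zicond, here is a minimal C model of the two instructions as the ISA defines them, and of how a select on `rs2 == 0` can be composed from them. It is only a sketch: the helper names `select_eq0_or_zero` and `select_eq0` are made up for illustration, and whether the patch expands each of the listed cases exactly this way is an assumption, not something taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* czero.eqz rd, rs1, rs2:  rd = (rs2 == 0) ? 0 : rs1  */
static uint64_t czero_eqz (uint64_t rs1, uint64_t rs2)
{
  return rs2 == 0 ? 0 : rs1;
}

/* czero.nez rd, rs1, rs2:  rd = (rs2 != 0) ? 0 : rs1  */
static uint64_t czero_nez (uint64_t rs1, uint64_t rs2)
{
  return rs2 != 0 ? 0 : rs1;
}

/* Hypothetical helper: rd = (rs2 == 0) ? a : 0.
   One arm is zero, so a single czero.nez is enough.  */
static uint64_t select_eq0_or_zero (uint64_t rs2, uint64_t a)
{
  return czero_nez (a, rs2);
}

/* Hypothetical helper: rd = (rs2 == 0) ? a : b.
   Each czero zeroes one arm; OR-ing the results picks the live one.  */
static uint64_t select_eq0 (uint64_t rs2, uint64_t a, uint64_t b)
{
  return czero_nez (a, rs2) | czero_eqz (b, rs2);
}
```

The point of the model is that a select with a zero arm needs one instruction, while the general select needs two czero instructions plus an OR.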
> -----Original Message-----
> From: Jan Beulich
> Sent: Tuesday, August 1, 2023 1:49 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Liu, Hongtao ; Kirill Yukhin
>
> Subject: [PATCH] x86: fold two of vec_dupv2df's alternatives
>
> By using Yvm in the source, both can be expressed in one.
>
> gcc/
>
./multilib.am already specifies this same command, and make warns about
the earlier one being ignored when seeing the later one. All that needs
retaining to still satisfy the preceding comment is the extra
dependency.
libatomic/
* Makefile.am (all-multi): Drop commands.
* Makefile
By using Yvm in the source, both can be expressed in one.
gcc/
* sse.md (vec_dupv2df): Fold the middle two of the
alternatives.
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -13784,21 +13784,20 @@
(set_attr "mode" "DF,DF,V1DF,V1DF,V1DF,V2DF,V1DF,V1DF,V1DF")])
After
b9d7140c80bd3c7355b8291bb46f0895dcd8c3cb is the first bad commit
commit b9d7140c80bd3c7355b8291bb46f0895dcd8c3cb
Author: Jan Hubicka
Date: Fri Jul 28 09:16:09 2023 +0200
loop-split improvements, part 1
Now we have
vpbroadcastd %ecx, %xmm0
vpaddd .LC3(%rip), %xmm0, %xmm0
v
Ping
On 21/07/23 3:43 pm, Surya Kumari Jangala via Gcc-patches wrote:
> The improve_allocation() routine does not update the
> allocated_hardreg_p[] array after an allocno is assigned a register.
>
> If the register chosen in improve_allocation() is one that already has
> been assigned to a confl
On 7/17/23 15:28, Patrick O'Neill wrote:
The RISC-V Ztso extension currently has no effect on generated code.
With the additional ordering constraints guaranteed by Ztso, we can emit
more optimized atomic mappings than the RVWMO mappings.
This PR defines a strengthened version of Andrea Parri
On Sat, Jul 29, 2023 at 11:55 AM haochen.jiang via Gcc-regression
wrote:
>
> On Linux/x86_64,
>
> b9d7140c80bd3c7355b8291bb46f0895dcd8c3cb is the first bad commit
> commit b9d7140c80bd3c7355b8291bb46f0895dcd8c3cb
> Author: Jan Hubicka
> Date: Fri Jul 28 09:16:09 2023 +0200
>
> loop-split im
`#pragma GCC target' is not currently handled in preprocess-only mode (e.g.,
when running gcc -E or gcc -save-temps). As noted in the PR, this means that
if the target pragma defines any macros, those macros are not effective in
preprocess-only mode. Similarly, such macros are not effective when
co
Tested on x86_64-pc-linux-gnu, does this look OK for trunk?
-- >8 --
gcc/cp/ChangeLog:
* ptree.cc (cxx_print_decl): Check for DECL_LANG_SPECIFIC and
TS_DECL_COMMON only when necessary. Print DECL_TEMPLATE_INFO
for all decls that have it, not just VAR_DECL or FUNCTION_DEC
Tested on x86_64-pc-linux-gnu, does this look OK for trunk?
-- >8 --
In the C++ front end, a COMPONENT_REF's second operand isn't always a
decl (at least at template parse time). This patch makes the generic
pretty printer not ICE when printing such a COMPONENT_REF.
gcc/ChangeLog:
* tr
In this case we are not removing convert to a bigger size
back to the same size (or smaller) if signedness does not
match.
For an example:
```
signed char _1;
...
_1 = *a_4(D);
b_5 = (short unsigned int) _1;
_2 = (unsigned char) b_5;
```
The inner cast is not needed and can be removed but w
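To make the claim about the example concrete, here is a small C sketch (my own, not part of the patch) that mirrors the GIMPLE above and checks exhaustively that going through the wider unsigned type gives the same result as converting directly. The function names are hypothetical.

```c
#include <assert.h>

/* Mirrors the GIMPLE: signed char -> short unsigned int -> unsigned char.  */
static unsigned char via_wider (signed char x)
{
  unsigned short b = (unsigned short) x;
  return (unsigned char) b;
}

/* The same conversion with the intermediate cast removed.  */
static unsigned char direct (signed char x)
{
  return (unsigned char) x;
}

/* Count the inputs for which the two forms disagree.  */
static int cast_mismatches (void)
{
  int bad = 0;
  for (int v = -128; v <= 127; v++)
    if (via_wider ((signed char) v) != direct ((signed char) v))
      bad++;
  return bad;
}
```

Conversions to unsigned types are defined modulo 2^N, so truncating the widened value back to 8 bits keeps exactly the low byte of the original.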
This libbacktrace patch, based on one by Andres Freund, uses the
_pgmptr variable declared on Windows to find the executable file name
if none is specified. Bootstrapped and ran libbacktrace testsuite on
x86_64-pc-linux-gnu. Committed to mainline.
Ian
Patch from Andres Freund:
* configure.ac: C
On 7/31/23 15:43, Prathamesh Kulkarni via Gcc-patches wrote:
On Mon, 19 Jun 2023 at 19:59, Stefan Schulze Frielinghaus via
Gcc-patches wrote:
Comparisons between memory and constants might be done in a smaller mode
resulting in smaller constants which might finally end up as immediates
inst
On Mon, 2023-07-31 at 13:46 +0200, Benjamin Priour wrote:
> Hi Dave,
>
> On Fri, Jul 21, 2023 at 10:10 PM David Malcolm
> wrote:
[...snip...]
> >
> > I see that we have test coverage for:
> > noexcept-new.C: -fno-exceptions with new vs nothrow-new
> > whereas:
> > new-2.C has (implicitly)
On Fri, Jul 28, 2023 at 6:58 PM David Malcolm wrote:
>
> On Fri, 2023-07-21 at 19:08 -0400, Lewis Hyatt wrote:
> > Add a new linemap reason LC_GEN which enables encoding the location
> > of data
> > that was generated during compilation and does not appear in any
> > source file.
> > There could b
>> From my recollection this is usually 30-40% faster than the naive tree
>> adder and also amenable to vectorization. As long as the multiplication
>> is not terribly slow, that is. Mula's algorithm should be significantly
>> faster even, another 30% IIRC.
>> I'm not against continuing with th
Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
-- >8 --
Now that cp_parser_constant_expression accepts a null non_constant_p,
we can transitively remove dummy arguments in the call chain.
Running dg.exp and counting the # of is_rvalue_constant_expression calls
from cp_parser_consta
On Tue, 1 Aug 2023 at 03:13, Prathamesh Kulkarni
wrote:
>
> On Mon, 19 Jun 2023 at 19:59, Stefan Schulze Frielinghaus via
> Gcc-patches wrote:
> >
> > Comparisons between memory and constants might be done in a smaller mode
> > resulting in smaller constants which might finally end up as immediat
On Mon, 19 Jun 2023 at 19:59, Stefan Schulze Frielinghaus via
Gcc-patches wrote:
>
> Comparisons between memory and constants might be done in a smaller mode
> resulting in smaller constants which might finally end up as immediates
> instead of in the literal pool.
>
> For example, on s390x a non-
On Fri, 21 Jul 2023 at 16:52, Martin Uecker via Gcc-patches
wrote:
>
>
>
> This patch adds a warning for allocations with insufficient size
> based on the "alloc_size" attribute and the type of the pointer
> the result is assigned to. While it is theoretically legal to
> assign to the wrong pointe
> +/* FIXME: We don't allow vectorize "__builtin_popcountll" yet since it needs
> "vec_pack_trunc" support
> + and such pattern may cause inferior codegen.
> + We will enable "vec_pack_trunc" when we support reasonable vector
> cost model. */
Wait, why do we need vec_pack_trunc f
In some cases (usually dealing with bools only), there could be some statements
left behind which are considered trivial dead.
An example is:
```
bool f(bool a, bool b)
{
if (!a && !b)
return 0;
if (!a && b)
return 0;
if (a && !b)
return 0;
return 1;
}
```
Wh
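As a sanity check on the example (my own sketch, not from the patch): the three early returns cover every input except `a && b`, so the function should reduce to a single AND. The names `f_simplified` and `f_mismatches` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* The function from the example, unchanged.  */
static bool f (bool a, bool b)
{
  if (!a && !b)
    return 0;
  if (!a && b)
    return 0;
  if (a && !b)
    return 0;
  return 1;
}

/* What the function is expected to reduce to.  */
static bool f_simplified (bool a, bool b)
{
  return a && b;
}

/* Count the inputs (out of four) where the two versions differ.  */
static int f_mismatches (void)
{
  int bad = 0;
  for (int a = 0; a <= 1; a++)
    for (int b = 0; b <= 1; b++)
      if (f (a, b) != f_simplified (a, b))
        bad++;
  return bad;
}
```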
Hi,
After some detailed study and consideration on how to use the new attribute
“counted_by”
in __builtin_dynamic_object_size, I came up with the following example with
detailed explanation
on the expected behavior from GCC on using this new attribute.
Please take a look at this example and
On Monday, 31.07.2023 at 15:39 -0400, Siddhesh Poyarekar wrote:
> On 2023-07-21 07:21, Martin Uecker via Gcc-patches wrote:
> >
> >
> > This patch adds a warning for allocations with insufficient size
> > based on the "alloc_size" attribute and the type of the pointer
> > the result is assig
On 2023-07-21 07:21, Martin Uecker via Gcc-patches wrote:
This patch adds a warning for allocations with insufficient size
based on the "alloc_size" attribute and the type of the pointer
the result is assigned to. While it is theoretically legal to
assign to the wrong pointer type and cast it t
Hi Juzhe,
> +/* Expand Vector POPCOUNT by parallel popcnt:
> +
> + int parallel_popcnt(uint32_t n) {
> + #define POW2(c) (1U << (c))
> + #define MASK(c) (static_cast<uint32_t>(-1) / (POW2(POW2(c)) + 1U))
> + #define COUNT(x, c) ((x) & MASK(c)) + (((x)>>(POW2(c))) & MASK(c))
> + n = CO
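For reference, a standalone C sketch of the parallel popcount technique that the quoted comment describes. The three macros follow the comment, with a C cast in place of the C++ one and extra parentheses around COUNT. The five COUNT steps are the standard sequence for a 32-bit value and are my completion, not copied from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define POW2(c) (1U << (c))
/* MASK(0) = 0x55555555, MASK(1) = 0x33333333, ..., MASK(4) = 0x0000ffff.  */
#define MASK(c) ((uint32_t) -1 / (POW2 (POW2 (c)) + 1U))
/* Add neighbouring bit fields of width POW2(c) pairwise.  */
#define COUNT(x, c) (((x) & MASK (c)) + (((x) >> POW2 (c)) & MASK (c)))

static int parallel_popcnt (uint32_t n)
{
  n = COUNT (n, 0);   /* sums of 1-bit fields, held in 2-bit fields */
  n = COUNT (n, 1);   /* sums held in 4-bit fields */
  n = COUNT (n, 2);   /* sums held in 8-bit fields */
  n = COUNT (n, 3);   /* sums held in 16-bit fields */
  n = COUNT (n, 4);   /* final sum in the low bits */
  return (int) n;
}
```

Each step uses only AND, shift and add, which is what makes the sequence suitable for per-lane vector code.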
On Mon, 31 Jul 2023, Hamza Mahfooz wrote:
> Hey Joseph,
>
> On Fri, Jul 28 2023 at 08:32:31 PM +00:00:00, Joseph Myers
> wrote:
> > > OK.
> >
> > --
> > Joseph S. Myers
> > jos...@codesourcery.com
>
> Since I don't have write access, do you mind pushing this for me?
Done.
--
Joseph S. Myers
> On Jul 31, 2023, at 2:23 PM, Siddhesh Poyarekar wrote:
>
> On 2023-07-31 14:13, Qing Zhao wrote:
>> Okay. I see.
>> Then if the size info from the TYPE is smaller than the size info from the
>> malloc,
>> then based on the current code, we use the smaller one between these two,
>> i.e, the
On 2023-07-31 14:13, Qing Zhao wrote:
Okay. I see.
Then if the size info from the TYPE is smaller than the size info from the
malloc,
then based on the current code, we use the smaller one between these two,
i.e, the size info from the TYPE. (Even for the OST_MAXIMUM).
Is such behavior co
Hi, Sid,
Thanks a lot.
> On Jul 31, 2023, at 1:07 PM, Siddhesh Poyarekar wrote:
>
> On 2023-07-31 13:03, Siddhesh Poyarekar wrote:
>> On 2023-07-31 12:47, Qing Zhao wrote:
>>> Hi, Sid and Jakub,
>>>
>>> I have a question in the following source portion of the routine
>>> “addr_object_size” of
This slightly improves bitwise_inverted_equal_p
for comparisons. Instead of just comparing the
comparisons' operands, also valueize them.
This will allow ccp and others to match the 2 comparisons
without an extra pass happening.
OK? Bootstrapped and tested on x86_64-linux-gnu.
gcc/ChangeLog:
This is a simple patch to move these 2 patterns over to use
bitwise_inverted_equal_p. It also allows us to remove 2 other patterns
which were used on comparisons as they are now handled by
the original pattern.
OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
gcc/ChangeLog:
On Mon, Jul 31, 2023 at 3:53 AM Richard Biener via Gcc-patches
wrote:
>
> On Mon, Jul 31, 2023 at 7:35 AM Andrew Pinski via Gcc-patches
> wrote:
> >
> > Even though these are done by combine_comparisons, we can add them to match
> > to allow simplifications during match rather than just during
>
This is a new version of the patch.
Instead of matching the inverted comparison directly inside
match, create a new function (bitwise_inverted_equal_p) to do it.
It is very similar to bitwise_equal_p that was added in
r14-2751-g2a3556376c69a1fb
but instead it says `expr1 == ~expr2`. A
On 2023-07-31 13:03, Siddhesh Poyarekar wrote:
On 2023-07-31 12:47, Qing Zhao wrote:
Hi, Sid and Jakub,
I have a question in the following source portion of the routine
“addr_object_size” of gcc/tree-object-size.cc:
743 bytes = compute_object_offset (TREE_OPERAND (ptr, 0), var);
74
Hi,
when IPA-SRA detects whether a parameter passed by reference is
written to, it does not special case CLOBBERs which means it often
bails out unnecessarily, especially when dealing with C++ destructors.
Fixed by the obvious continue in the two relevant loops.
The (slightly) more complex testca
On 2023-07-31 12:47, Qing Zhao wrote:
Hi, Sid and Jakub,
I have a question in the following source portion of the routine
“addr_object_size” of gcc/tree-object-size.cc:
743 bytes = compute_object_offset (TREE_OPERAND (ptr, 0), var);
744 if (bytes != error_mark_node)
745
On 7/28/23 15:11, Joseph Myers wrote:
This patch is OK.
I fixed the whitespace errors in the patch as well as a couple minor
ChangeLog entry items and pushed Costas's patch to the trunk.
jeff
Hello,
On Tue, Jul 18 2023, Aldy Hernandez wrote:
> On 7/17/23 15:14, Aldy Hernandez wrote:
>> Instead of reading the known zero bits in IPA, read the value/mask
>> pair which is available.
>>
>> There is a slight change of behavior here. I have removed the check
>> for SSA_NAME, as the ranger c
Hi, Sid and Jakub,
I have a question in the following source portion of the routine
“addr_object_size” of gcc/tree-object-size.cc:
743 bytes = compute_object_offset (TREE_OPERAND (ptr, 0), var);
744 if (bytes != error_mark_node)
745 {
746 bytes = size_for_offset
On Mon, 31 Jul 2023, Martin Uecker via Gcc-patches wrote:
> Joseph, I would appreciate if you could take a look at this?
>
> This fixes the remaining issues which require me to turn the
> warnings off with -Wno-vla-parameter and -Wno-nonnull in my
> projects.
The front-end changes are OK.
--
GCC 13.2 released[2] so I merged the series now that the branch is unfrozen.
Thanks,
Patrick
[2] https://inbox.sourceware.org/gcc/ZMJeq%2FY5SN+7i8a+@tucnak/T/#u
On 7/25/23 11:01, Patrick O'Neill wrote:
Discussed during the weekly RISC-V GCC meeting[1] and pre-approved by
Jeff Law.
If there are
The fold_using_range operand fetching mechanism has a variety of modes.
The "normal" mechanism simply invokes the current or supplied
range_query to satisfy fetching current range info for any ssa-names
used during the evaluation of the statement.
I also added support for fur_list which allow
On 7/28/23 04:17, Robin Dapp via Gcc-patches wrote:
Hi,
this patch extracts the hoist-pressure handling from gcse and puts it
into a separate file so it can be used by other passes in the future.
No functional change and I also abstained from c++ifying the code.
The naming with the regpressur
On 7/31/23 04:53, Richard Biener via Gcc-patches wrote:
On Tue, 25 Jul 2023, Richard Biener wrote:
The following removes the code checking whether a noop copy
is between something involved in the return sequence composed
of a SET and USE. Instead of checking for this special-case
the follow
On 7/31/23 04:54, Richard Biener via Gcc-patches wrote:
On Tue, 25 Jul 2023, Richard Biener wrote:
The following applies a micro-optimization to find_hard_regno_for_1,
re-ordering the check so we can easily jump-thread by using an else.
This reduces the time spent in this function by 15% for
On Fri, 28 Jul 2023, Jason Merrill via Gcc-patches wrote:
> > Thanks, I had thought there could be a potential issue with needing to also
> > check cpp_get_options(pfile)->traditional. But looking at it more, there's
> > no
> > code path currently that can end up here in traditional mode, so yes w
On Mon, 31 Jul 2023, Jeff Law wrote:
> > That's a good suggestion! Thanks, let me try to apply it to my workflow :)
> I'm thinking that as part of the CI POC being done by RISE that the base AMI
> image ought to be gcc-13 based and that we should configure the toolchains we
> build with -enable-wer
On Mon, 31 Jul 2023, Kito Cheng wrote:
> > I just configure with `--enable-werror-always', which we want to keep
> > our standards up to anyway,
>
> I rely on the host GCC, which is version 11 and relatively old compared to
> trunk, so --enable-werror-always will produce many -Wformat* warnings :(
If buildi
On 7/31/23 06:14, Wang, Yanzhang wrote:
Thanks for your comments, Jeff and Robin
Is the mulh case somehow common or critical?
Well, I would actually back up even further. What were the
circumstances that led to the mulh with a zero operand?
I think you both mentioned why should we add the mu
On Monday, 31.07.2023 at 14:33 +, Michael Matz wrote:
> Hello,
>
> On Fri, 28 Jul 2023, Martin Uecker wrote:
>
> > > > Sorry, somehow I must be missing something here.
> > > >
> > > > If you add something you would create a new value and this may (in
> > > > an object) have random new p
On 7/28/23 01:05, Richard Biener via Gcc-patches wrote:
The following delays sinking of loads within the same innermost
loop when it was unconditional before. That's a not uncommon
issue preventing vectorization when masked loads are not available.
Bootstrapped and tested on x86_64-unknown-l
On 7/31/23 08:01, Richard Biener via Gcc-patches wrote:
The following makes sure to limit the shift operand when vectorizing
(short)((int)x >> 31) via (short)x >> 31 as the out of bounds shift
operand otherwise invokes undefined behavior. When we determine
whether we can demote the operand we
On 7/31/23 08:52, Kito Cheng via Gcc-patches wrote:
On Mon, Jul 31, 2023 at 10:03 PM Maciej W. Rozycki wrote:
On Mon, 31 Jul 2023, Kito Cheng wrote:
Sorry for disturbing, pushed a fix for that, and...added
-Werror=unused-variable to my build script to prevent that from happening
again :(
I ju
On Mon, Jul 31, 2023 at 10:03 PM Maciej W. Rozycki wrote:
>
> On Mon, 31 Jul 2023, Kito Cheng wrote:
>
> > Sorry for disturbing, pushed a fix for that, and...added
> > -Werror=unused-variable to my build script to prevent that from happening
> > again :(
>
> I just configure with `--enable-werror-always'
Hey Joseph,
On Fri, Jul 28 2023 at 08:32:31 PM +00:00:00, Joseph Myers
wrote:
OK.
--
Joseph S. Myers
jos...@codesourcery.com
Since I don't have write access, do you mind pushing this for me?
PING * 2
On Tue, Jul 25, 2023 at 8:32 AM Aldy Hernandez wrote:
>
> Ping
>
> On Mon, Jul 17, 2023, 15:14 Aldy Hernandez wrote:
>>
>> Instead of reading the known zero bits in IPA, read the value/mask
>> pair which is available.
>>
>> There is a slight change of behavior here. I have removed the
Hello,
On Fri, 28 Jul 2023, Martin Uecker wrote:
> > > Sorry, somehow I must be missing something here.
> > >
> > > If you add something you would create a new value and this may (in
> > > an object) have random new padding. But the "expected" value should
> > > be updated by a failed atomic_co
Address comments and fix in V2:
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625870.html
Ok for trunk?
juzhe.zh...@rivai.ai
From: Kito Cheng
Date: 2023-07-31 21:38
To: Juzhe-Zhong
CC: gcc-patches; kito.cheng; jeffreyalaw; rdapp.gcc
Subject: Re: [PATCH] RISC-V: Support POPCOUNT auto-vect
Oh, Thanks a lot.
I can test it in RISC-V backend now.
But I have another question:
>> I'm a bit confused (but also by the existing mask code), whether
>>vect_nargs needs adjustment depends on the IFN in the IL we analyze.
>>If if-conversion recognizes a .COND_ADD then we need to add nothing
>>fo
This patch is inspired by "lowerCTPOP" in LLVM.
Support popcount auto-vectorization by following the LLVM approach.
Before this patch:
:7:21: missed: couldn't vectorize loop
:8:14: missed: not vectorized: relevant stmt not supported: _5 =
__builtin_popcount (_4);
After this patch:
popcount_32:
ble
On Mon, 31 Jul 2023, Kito Cheng wrote:
> Sorry for disturbing, pushed a fix for that, and...added
> -Werror=unused-variable to my build script to prevent that from happening
> again :(
I just configure with `--enable-werror-always', which we want to keep
our standards up to anyway, but if you find this
The following makes sure to limit the shift operand when vectorizing
(short)((int)x >> 31) via (short)x >> 31 as the out of bounds shift
operand otherwise invokes undefined behavior. When we determine
whether we can demote the operand we know we at most shift in the
sign bit so we can adjust the s
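A scalar C sketch of why the shift amount has to be limited once the operation is narrowed. It assumes 16-bit `short`, 32-bit `int`, and arithmetic right shift of negative values (which GCC provides). Clamping the amount to 15, the narrow type's precision minus one, is my reading of "limit the shift operand", not code from the patch, and the function names are hypothetical.

```c
#include <assert.h>

/* Original form: widen to int, shift by 31, truncate back to short.  */
static short wide_shift (short x)
{
  return (short) ((int) x >> 31);
}

/* Narrowed form as it would run in a 16-bit lane.  A shift by 31 is out
   of bounds there, so the amount is clamped to 15; only copies of the
   sign bit get shifted in, so the result is unchanged.  */
static short narrow_shift_clamped (short x)
{
  return (short) (x >> 15);
}

/* Count the short values for which the two forms disagree.  */
static int shift_mismatches (void)
{
  int bad = 0;
  for (int v = -32768; v <= 32767; v++)
    if (wide_shift ((short) v) != narrow_shift_clamped ((short) v))
      bad++;
  return bad;
}
```

Both forms yield -1 for negative inputs and 0 otherwise, which is why the demotion is safe as long as the shift amount stays in range.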
On 6/19/23 08:23, Stefan Schulze Frielinghaus via Gcc-patches wrote:
Comparisons between memory and constants might be done in a smaller mode
resulting in smaller constants which might finally end up as immediates
instead of in the literal pool.
For example, on s390x a non-symmetric compariso
On Mon, 31 Jul 2023, ??? wrote:
> Yeah. I have tried this case too.
>
> But this case doesn't need to be vectorized as COND_FMA, am I right?
Only when you enable loop masking. Alternatively use
double foo (double *a, double *b, double *c)
{
double result = 0.0;
for (int i = 0; i < 1024; ++
>> Drop outer loop if word_size never larger than 1?
Yeah, we don't have TI vector modes for now.
I copied the code directly from LLVM's generic intrinsic handling :)
since the LLVM generic code considers handling INT128 vectors.
I will remove all redundant code for INT128 vector mode in
On Mon, Jul 31, 2023 at 1:07 PM Andrzej Turko via Gcc-patches
wrote:
>
> Currently fprintf calls logging to a dump file take line numbers
> in the match.pd file directly as arguments.
> When match.pd is edited, referenced code changes line numbers,
> which causes changes to many fprintf calls and,
Ok. Thanks. Li Pan is still testing.
juzhe.zh...@rivai.ai
From: Kito Cheng
Date: 2023-07-31 21:45
To: Kito Cheng
CC: Juzhe-Zhong; gcc-patches; jeffreyalaw; macro; pan2.li; rdapp.gcc
Subject: Re: [committed] RISC-V: Fix bug of get_mask_mode
I saw you didn't push yet, so I pushed another pa
Hi Maciej:
Sorry for disturbing, pushed a fix for that, and...added
-Werror=unused-variable to my build script to prevent that from happening
again :(
On Mon, Jul 31, 2023 at 7:08 PM Maciej W. Rozycki wrote:
>
> On Mon, 31 Jul 2023, Kito Cheng via Gcc-patches wrote:
>
> > Pushed, thanks :)
>
> This bre
I saw you didn't push yet, so I pushed another patch to fix those
unused variable issues.
On Mon, Jul 31, 2023 at 9:12 PM Kito Cheng wrote:
>
> Ooops, I guess my code base was too old, and forgot to check that after
> rebase, thanks for fixing that!
>
> On Mon, Jul 31, 2023 at 20:21, Juzhe-Zhong wrote:
>>
>> Fi
Yeah. I have tried this case too.
But this case doesn't need to be vectorized as COND_FMA, am I right?
What I wonder is whether this condition:
if (mask_opno >= 0 && reduc_idx >= 0)
or similarly for len
if (len_opno >= 0 && reduc_idx >= 0)
Whether they are redundant in vectorizable_cal
On Mon, Jul 31, 2023 at 1:06 PM Andrzej Turko via Gcc-patches
wrote:
>
> So far genmatch has been using an unordered map to store information about
> functions to be generated. Since corresponding locations from match.pd were
> used as keys in the map, even small changes to match.pd which caused
>
On Mon, Jul 31, 2023 at 1:06 PM Andrzej Turko via Gcc-patches
wrote:
>
> Get_or_insert method is already supported by the unordered hash map.
> Adding it to the ordered map enables us to replace the unordered map
> with the ordered one in cases where ordering may be useful.
OK. Note the Makefile
On Mon, Jul 31, 2023 at 8:03 PM Juzhe-Zhong wrote:
>
> This patch is inspired by "lowerCTPOP" in LLVM.
> Support popcount auto-vectorization by following the LLVM approach.
> https://godbolt.org/z/3K3GzvY7f
>
> Before this patch:
>
> :7:21: missed: couldn't vectorize loop
> :8:14: missed: not vectoriz
On Mon, 31 Jul 2023, juzhe.zh...@rivai.ai wrote:
> Hi, Richi.
>
> >> I think you need to use fma from math.h together with -ffast-math
> >>to get fma.
>
> As you said, this is one of the cases I tried:
> https://godbolt.org/z/xMzrrv5dT
> GCC failed to vectorize.
>
> Could you help me with this?
ping
On Mon, Jun 19, 2023 at 04:23:57PM +0200, Stefan Schulze Frielinghaus wrote:
> Comparisons between memory and constants might be done in a smaller mode
> resulting in smaller constants which might finally end up as immediates
> instead of in the literal pool.
>
> For example, on s390x a non-
Ooops, I guess my code base was too old, and forgot to check that after
rebase, thanks for fixing that!
On Mon, Jul 31, 2023 at 20:21, Juzhe-Zhong wrote:
> Fix bugs:
> ../../../riscv-gcc/gcc/config/riscv/riscv-v.cc: In function ‘void
> riscv_vector::emit_vlmax_masked_fp_mu_insn(unsigned int, int, rtx_def**)’:
>
statement_sink_location for loads is currently confused about
stores that are not on the paths we are sinking across. The
following avoids this by explicitly checking whether a block
with a store is on any of those paths. To not perform too many
walks over the sub-part of the CFG between the ori
Fix bugs:
../../../riscv-gcc/gcc/config/riscv/riscv-v.cc: In function ‘void
riscv_vector::emit_vlmax_masked_fp_mu_insn(unsigned int, int, rtx_def**)’:
../../../riscv-gcc/gcc/config/riscv/riscv-v.cc:999:54: error: request for
member ‘require’ in ‘riscv_vector::get_mask_mode(dest_mode)’
Committed, thanks Richard.
Pan
-----Original Message-----
From: Gcc-patches On Behalf
Of Richard Sandiford via Gcc-patches
Sent: Monday, July 31, 2023 6:17 PM
To: juzhe.zh...@rivai.ai
Cc: gcc-patches@gcc.gnu.org; rguent...@suse.de
Subject: Re: [PATCH] internal-fn: Refine macro define of COND_*
Thanks for your comments, Jeff and Robin
> > Is the mulh case somehow common or critical?
> Well, I would actually back up even further. What were the
> circumstances that led to the mulh with a zero operand?
I think you both asked why we should add the mulh * 0 simplification.
Unfortunately, I hav
Hi, Richi.
>> I think you need to use fma from math.h together with -ffast-math
>>to get fma.
As you said, this is one of the cases I tried:
https://godbolt.org/z/xMzrrv5dT
GCC failed to vectorize.
Could you help me with this?
Thanks.
juzhe.zh...@rivai.ai
From: Richard Biener
Date: 2023-07-
On Thu, 27 Jul 2023 at 12:04, Richard Biener wrote:
>
> On Wed, 26 Jul 2023, Prathamesh Kulkarni wrote:
>
> > Sorry, I meant PR110280 in subject line (not PR10280).
>
> OK after 13.2 is released and the branch is open again.
Thanks, committed the patch to releases/gcc-13 branch in:
https://gcc.gnu
This patch is inspired by "lowerCTPOP" in LLVM.
Support popcount auto-vectorization by following the LLVM approach.
https://godbolt.org/z/3K3GzvY7f
Before this patch:
:7:21: missed: couldn't vectorize loop
:8:14: missed: not vectorized: relevant stmt not supported: _5 =
__builtin_popcount (_4);
Aft
On Mon, 31 Jul 2023, juzhe.zh...@rivai.ai wrote:
> OK. Thanks, Richard.
>
> Could you give me a case where SVE can vectorize a reduction with FMA?
> Meaning it will go into vectorize_call and vectorize FMA into COND_FMA ?
>
> I tried many times to reproduce such cases but I failed.
I think you n
OK. Thanks, Richard.
Could you give me a case where SVE can vectorize a reduction with FMA?
Meaning it will go into vectorize_call and vectorize FMA into COND_FMA ?
I tried many times to reproduce such cases but I failed.
Thanks.
juzhe.zh...@rivai.ai
From: Richard Sandiford
Date: 2023-07-31 1
Hi Dave,
On Fri, Jul 21, 2023 at 10:10 PM David Malcolm wrote:
[...]
> It looks like something's gone wrong with the indentation in the above:
> previously we had tab characters, but now I'm seeing a pair of spaces,
> which means this wouldn't line up properly. This might be a glitch
> somewhere
On Mon, 31 Jul 2023, Kito Cheng via Gcc-patches wrote:
> Pushed, thanks :)
This breaks compilation:
.../gcc/config/riscv/riscv-v.cc: In function 'void
riscv_vector::expand_vec_series(rtx, rtx, rtx)':
.../gcc/config/riscv/riscv-v.cc:1251:16: error: unused variable 'mask_mode'
[-Werror=unused-v
Currently fprintf calls logging to a dump file take line numbers
in the match.pd file directly as arguments.
When match.pd is edited, referenced code changes line numbers,
which causes changes to many fprintf calls and, thus, to many
(usually all) .cc files generated by genmatch. This forces make
t
The following reduces the number of object files that need to be rebuilt
after match.pd has been modified. Right now a change to match.pd which
adds/removes a line almost always forces recompilation of all files that
genmatch generates from it. This is because of unnecessary changes to
the generate
Get_or_insert method is already supported by the unordered hash map.
Adding it to the ordered map enables us to replace the unordered map
with the ordered one in cases where ordering may be useful.
Signed-off-by: Andrzej Turko
gcc/ChangeLog:
* ordered-hash-map.h: Add get_or_insert.
So far genmatch has been using an unordered map to store information about
functions to be generated. Since corresponding locations from match.pd were
used as keys in the map, even small changes to match.pd which caused
line number changes would change the order in which the functions are
generated
>> Ah. So then just feed it cond_fn? I mean, we don't have
>> LEN_FMA, the only LEN-but-not-MASK ifns are those used by
>> power/s390, LEN_LOAD and LEN_STORE?
Yes, that's why I feed cond_fn with get_len_internal_fn (cond_fn)
>> Yes, but all of this depends on what the original ifn is, no?
Yes.
On Tue, 25 Jul 2023, Richard Biener wrote:
> The following applies a micro-optimization to find_hard_regno_for_1,
> re-ordering the check so we can easily jump-thread by using an else.
> This reduces the time spent in this function by 15% for the testcase
> in the PR.
>
> Bootstrap & regtest runn
On Tue, 25 Jul 2023, Richard Biener wrote:
> The following removes the code checking whether a noop copy
> is between something involved in the return sequence composed
> of a SET and USE. Instead of checking for this special-case
> the following makes us only ever remove noop copies between
> ps