>> Umm, this patch has been queued up for at least a couple weeks now.
Oh. I am sorry I didn't see this patch, since it doesn't CC me.
I am not subscribed to gcc-patches, so I may miss patches that don't
explicitly CC me. I only happened to see your reply email today, so I am
replying now.
juzhe.zh
>> Testing what specifically? Are you asking for correctness tests,
>> performance/code quality tests?
Add a memcpy test using RVV instructions, just as we are adding testcases for
auto-vectorization support.
For example:
#include
#include
#include
void foo (int32_t * a, int32_t * b, int n
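The snippet above is cut off in the archive; a minimal sketch of the kind of
testcase being asked for might look like the following. The dg- directives,
options, and scan pattern are my assumptions, not the author's actual test.
/* { dg-do compile } */
/* { dg-options "-O2 -march=rv64gcv -mabi=lp64d" } */
#include <stdint.h>
#include <string.h>
void
foo (int32_t *a, int32_t *b, int n)
{
  /* A straight copy that the cpymem expansion should turn into RVV code.  */
  memcpy (a, b, n * sizeof (int32_t));
}
/* { dg-final { scan-assembler "vsetvli" } } */  /* assumed check; the real pattern may differ */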
On 7/20/23 18:59, HAO CHEN GUI wrote:
Hi Jeff,
On 2023/7/21 5:27, Jeff Law wrote:
Wouldn't it make more sense to just try rotate/mask in the original mode before
trying a shift in a widened mode? I'm not sure why we need a target hook here.
There is no change to try rotate/mask with the orig
On 8/4/23 17:10, 钟居哲 wrote:
Could you add testcases for this patch?
Testing what specifically? Are you asking for correctness tests,
performance/code quality tests?
+;; The (use (and (match_dup 1) (const_int 127))) is here to prevent the
+;; optimizers from changing cpymem_loop_* into t
Could you add testcases for this patch?
+;; The (use (and (match_dup 1) (const_int 127))) is here to prevent the
+;; optimizers from changing cpymem_loop_* into this.
+(define_insn "@cpymem_straight"
+ [(set (mem:BLK (match_operand:P 0 "register_operand" "r,r"))
+ (mem:BLK (match_operand:P
On Fri, 4 Aug 2023, Richard Biener via Gcc-patches wrote:
> > Sorry, I hoped it wouldn't take me almost 3 months and would be much shorter
> > as well, but clearly I'm not good at estimating stuff...
>
> Well, it’s definitely feature creep with now the _Decimal and bitfield stuff …
I think featu
Adds a simplification that folds ((x ^ y) & z) | x into (z & y) | x.
Merges this simplification with ((x | y) & z) | x -> (z & y) | x
to avoid a duplicate pattern. Tested successfully on x86_64 and x86 targets.
PR tree-opt/109938
gcc/ChangeLog:
* match.pd ((x ^ y) & z) | x
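As a quick sanity check of the identity behind the fold (not part of the patch
or its testsuite), a brute-force sketch over 8-bit values:
#include <assert.h>
int
main (void)
{
  for (unsigned x = 0; x < 256; x++)
    for (unsigned y = 0; y < 256; y++)
      for (unsigned z = 0; z < 256; z++)
        /* ((x ^ y) & z) | x should equal (z & y) | x for all bit patterns.  */
        assert ((((x ^ y) & z) | x) == ((z & y) | x));
  return 0;
}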
On 8/1/23 19:38, Xiao Zeng wrote:
This patch recognizes Zicond patterns when the select pattern's
condition is eq or ne with 0 (using eq as an example), namely the
four cases below (a small C sketch of case 1 follows the list):
1 rd = (rs2 == 0) ? non-imm : 0
2 rd = (rs2 == 0) ? non-imm : non-imm
3 rd = (rs2 == 0) ? reg : non-imm
4 rd = (rs2 == 0) ? reg : reg
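A hedged C-level sketch of case 1 above (rd = (rs2 == 0) ? non-imm : 0);
whether it actually becomes a single czero.* instruction depends on -march
including zicond and on the cost model:
long
sel_case1 (long rs2, long val)
{
  /* val plays the role of the non-immediate (register) operand.  */
  return rs2 == 0 ? val : 0;
}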
On 8/1/23 00:47, Robin Dapp via Gcc-patches wrote:
I'm not against continuing with the more well-known approach for now,
but we should keep in mind that there might still be potential for improvement.
No. I don't think it's faster.
I did a quick check on my x86 laptop and it's roughly 25% fas
On 7/17/23 22:47, Joern Rennecke wrote:
Subject: cpymem for RISCV with v extension
From: Joern Rennecke
Date: 7/17/23, 22:47
To: GCC Patches
As discussed on last week's patch call, this patch uses either a
straight copy or an opaque pattern that emits the loop as assembly to
optimize cpym
This patch makes -fanalyzer make use of the function attribute
"alloc_size", allowing -fanalyzer to emit -Wanalyzer-allocation-size,
-Wanalyzer-out-of-bounds, and -Wanalyzer-tainted-allocation-size on
execution paths involving allocations using such functions.
Successfully bootstrapped & regrteste
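A hedged sketch of the kind of path this lets -fanalyzer reason about;
my_alloc is a hypothetical allocator, not something from the patch:
#include <stddef.h>
/* Tell the analyzer the first argument is the allocation size in bytes.  */
void *my_alloc (size_t n) __attribute__ ((alloc_size (1)));
void
use (void)
{
  int *p = my_alloc (3);   /* 3-byte allocation */
  p[1] = 0;                /* writes past the 3 requested bytes (with 4-byte int):
                              a candidate for -Wanalyzer-out-of-bounds */
}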
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r14-3000-g187b213ddbe7ea.
gcc/analyzer/ChangeLog:
* svalue.cc (region_svalue::dump_to_pp): Support NULL type.
(constant_svalue::dump_to_pp): Likewise.
(initial_svalue::dump_to_pp): Likewise.
On 8/4/23 03:29, Xiao Zeng wrote:
On Thu, Aug 03, 2023 at 01:20:00 AM Jeff Law wrote:
In the two incorrect optimization modes, I only considered the
case of satisfying the ELSE branch, but in fact, as with the two correct
optimization modes, I should consider the case of satisfying
both the THA
gcc/c-family/ChangeLog:
PR C/108896
* c-ubsan.cc (ubsan_instrument_bounds): Use counted_by attribute
information.
gcc/testsuite/ChangeLog:
PR C/108896
* gcc.dg/ubsan/flex-array-counted-by-bounds.c: New test.
* gcc.dg/ubsan/flex-array-counted-by-bou
gcc/ChangeLog:
PR C/108896
* tree-object-size.cc (addr_object_size): Use the counted_by
attribute info.
* tree.cc (component_ref_has_counted_by_p): New function.
(component_ref_get_counted_by): New function.
* tree.h (component_ref_has_counted_by_p):
'counted_by (COUNT)'
The 'counted_by' attribute may be attached to the flexible array
member of a structure. It indicates that the number of
elements of the array is given by the field named "COUNT" in the
same structure as the flexible array member. GCC uses this
inf
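A hedged sketch of a structure using the attribute as described above; the
spelling follows the documentation snippet and may differ in detail from the
final patch:
#include <stddef.h>
struct vec {
  size_t count;                                    /* the COUNT field */
  int data[] __attribute__ ((counted_by (count))); /* flexible array member */
};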
Hi,
This is the 2nd version of the patch. Per our discussion of the
review comments on the 1st version, the major changes in this version
are:
1. change the name "element_count" to "counted_by";
2. change the parameter for the attribute from a STRING to an
Identifier;
3. Add logic and test
> On Aug 4, 2023, at 3:09 PM, Siddhesh Poyarekar wrote:
>
> On 2023-08-04 15:06, Qing Zhao wrote:
>>> Yes, that's what I'm thinking.
>>>
> so `q` must be pointing to a single element. So you could deduce:
>
> 1. the minimum size of the whole object that q points to.
You mean
On 2023-08-04 15:06, Qing Zhao wrote:
Yes, that's what I'm thinking.
so `q` must be pointing to a single element. So you could deduce:
1. the minimum size of the whole object that q points to.
You mean that the TYPE will determine the minimum size of the whole object?
(Does this include th
On Thu, Aug 3, 2023 at 10:31 PM Jeff Law via Gcc-patches
wrote:
>
>
>
> On 8/3/23 17:38, Vineet Gupta wrote:
>
> >> ;-) Actually if you wanted to poke at zicond, the most interesting
> >> unexplored area I've come across is the COND_EXPR handling in gimple.
> >> When we expand a COND_EXPR into RT
> On Aug 4, 2023, at 12:36 PM, Siddhesh Poyarekar wrote:
>
> On 2023-08-04 11:27, Qing Zhao wrote:
>>> On Aug 4, 2023, at 10:40 AM, Siddhesh Poyarekar wrote:
>>>
>>> On 2023-08-03 13:34, Qing Zhao wrote:
One thing I need to point out first is, currently, even for regular fixed
size
Hi!
Repost because the patch was too large.
On Fri, Jul 28, 2023 at 06:03:33PM +, Joseph Myers wrote:
> Note that representations with too-large significand are defined to be
> noncanonical representations of zero, so you need to take care of that in
> decoding BID.
Done.
> You could e.g. h
The patch at the end adds a warning when a tail/sibling call cannot be
optimized for various reasons.
I built and tested GCC with and without the patch with this configuration:
Configured with: ../../gcc-mainline/configure --enable-languages=c
--disable-multilib --prefix=/pkgs/gcc-mainline --disable
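A hedged example of a call the new warning could flag; the reason shown (a
local's address escaping into the callee) is just one of the "various reasons"
mentioned above:
extern int callee (int *);
int
caller (void)
{
  int local = 42;
  /* local must stay live across the call, so this return cannot become
     a tail/sibling call.  */
  return callee (&local);
}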
> Am 04.08.2023 um 18:26 schrieb Martin Jambor :
>
> Hello,
>
>> On Wed, Aug 02 2023, Richard Biener wrote:
>>> On Mon, Jul 31, 2023 at 7:05 PM Martin Jambor wrote:
>>>
>>> Hi,
>>>
>>> when IPA-SRA detects whether a parameter passed by reference is
>>> written to, it does not special case
On Fri, 4 Aug 2023, Martin Uecker via Gcc-patches wrote:
> Here is a patch to reduce false positives in _Generic.
>
> Bootstrapped and regression tested on x86_64-linux.
>
> Martin
>
> c: _Generic should not warn in non-active branches [PR68193,PR97100]
>
> To avoid false diagnosti
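A hedged illustration of the kind of false positive being addressed: in a
tgmath-style macro, a warning option such as -Wconversion could fire in a
_Generic branch that is never selected:
#include <math.h>
#define my_sqrt(x) _Generic ((x),        \
                             float: sqrtf (x), \
                             double: sqrt (x))
double
f (double d)
{
  /* Selects the double branch; the float branch (double passed to sqrtf)
     should not be diagnosed.  */
  return my_sqrt (d);
}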
On 2023-08-04 11:27, Qing Zhao wrote:
On Aug 4, 2023, at 10:40 AM, Siddhesh Poyarekar wrote:
On 2023-08-03 13:34, Qing Zhao wrote:
One thing I need to point out first is, currently, even for regular fixed size
array in the structure,
We have this same issue, for example:
#define LENGTH 10
Hello,
On Wed, Aug 02 2023, Richard Biener wrote:
> On Mon, Jul 31, 2023 at 7:05 PM Martin Jambor wrote:
>>
>> Hi,
>>
>> when IPA-SRA detects whether a parameter passed by reference is
>> written to, it does not special case CLOBBERs which means it often
>> bails out unnecessarily, especially whe
On 8/4/23 03:52, Manolis Tsamis wrote:
Hi all,
It is true that regcprop currently does not propagate sp and hence
leela is not optimized, but from what I see this should be something
we can address.
The reason that the propagation fails is this check that I have added
when I introduced maybe
On Fri, Aug 04, 2023 at 01:25:07PM +, Richard Biener wrote:
> > @@ -144,6 +144,9 @@ DEFTREECODE (BOOLEAN_TYPE, "boolean_type
> > and TYPE_PRECISION (number of bits used by this type). */
> > DEFTREECODE (INTEGER_TYPE, "integer_type", tcc_type, 0)
Thanks.
> > +/* Bit-precise integer type
> On Aug 4, 2023, at 10:42 AM, Siddhesh Poyarekar wrote:
>
> On 2023-08-04 10:40, Siddhesh Poyarekar wrote:
>> On 2023-08-03 13:34, Qing Zhao wrote:
>>> One thing I need to point out first is, currently, even for regular fixed
>>> size array in the structure,
>>> We have this same issue, for e
> On Aug 4, 2023, at 10:40 AM, Siddhesh Poyarekar wrote:
>
> On 2023-08-03 13:34, Qing Zhao wrote:
>> One thing I need to point out first is, currently, even for regular fixed
>> size array in the structure,
>> We have this same issue, for example:
>> #define LENGTH 10
>> struct fix {
>> siz
ping
From: Wilco Dijkstra
Sent: 02 June 2023 18:28
To: GCC Patches
Cc: Richard Sandiford ; Kyrylo Tkachov
Subject: [PATCH] libatomic: Enable lock-free 128-bit atomics on AArch64
[PR110061]
Enable lock-free 128-bit atomics on AArch64. This is backwards compatible with
existing binaries, gi
Full review this time, sorry for skipping the tests earlier.
Prathamesh Kulkarni writes:
> diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> index 7e5494dfd39..680d0e54fd4 100644
> --- a/gcc/fold-const.cc
> +++ b/gcc/fold-const.cc
> @@ -85,6 +85,10 @@ along with GCC; see the file COPYING3.
Add support for ifunc selection based on CPUID register. Neoverse N1 supports
atomic 128-bit load/store, so use the FEAT_USCAT ifunc like newer Neoverse
cores.
Passes regress, OK for commit?
libatomic/
* config/linux/aarch64/host-config.h (ifunc1): Use CPUID in ifunc
selection.
-
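A hedged usage sketch of what "lock-free 128-bit atomics" means for user code;
on the cores described above libatomic can now serve this without a lock:
#include <stdatomic.h>
unsigned __int128
load128 (_Atomic unsigned __int128 *p)
{
  /* Typically lowered to a libatomic 16-byte atomic load.  */
  return atomic_load_explicit (p, memory_order_acquire);
}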
On 2023-08-04 10:40, Siddhesh Poyarekar wrote:
On 2023-08-03 13:34, Qing Zhao wrote:
One thing I need to point out first is, currently, even for regular
fixed size array in the structure,
We have this same issue, for example:
#define LENGTH 10
struct fix {
size_t foo;
int array[LENGTH];
On 2023-08-03 13:34, Qing Zhao wrote:
One thing I need to point out first is, currently, even for regular fixed size
array in the structure,
We have this same issue, for example:
#define LENGTH 10
struct fix {
size_t foo;
int array[LENGTH];
};
…
int main ()
{
struct fix *p;
p = al
Thanks.
I just updated the doc per your suggestion and committed as:
https://gcc.gnu.org/pipermail/gcc-cvs/2023-August/387588.html
Qing
> On Aug 3, 2023, at 1:29 PM, Joseph Myers wrote:
>
> On Thu, 3 Aug 2023, Qing Zhao via Gcc-patches wrote:
>
>> +@opindex Wflex-array-member-not-at-end
>> +@
> On Aug 4, 2023, at 3:38 AM, Kees Cook wrote:
>
> On Thu, Aug 03, 2023 at 09:31:24PM +, Qing Zhao wrote:
>> So, the basic question is:
>>
>> Given the following:
>>
>> struct fix {
>> int others;
>> int array[10];
>> }
>>
>> extern struct fix * alloc_buf ();
>>
>> int main ()
>> {
>>
Richard Biener writes:
> The following fixes a problem with my last attempt at avoiding
> out-of-bound shift values for vectorized right shifts of widened
> operands. Instead of truncating the shift amount with a bitwise
> AND, we actually need to saturate it to the target precision.
>
> The follo
The following patch fixes a problem found by the LRA port for the avr target.
The problem description is in the commit message.
The patch was successfully bootstrapped and tested on x86-64 and aarch64.
commit abf953042ace471720c1dc284b5f38e546fc0595
Author: Vladimir N. Makarov
Date: Fri Aug 4 08:04:
The following fixes a problem with my last attempt at avoiding
out-of-bound shift values for vectorized right shifts of widened
operands. Instead of truncating the shift amount with a bitwise
AND, we actually need to saturate it to the target precision.
The following does that and adds test covera
>
> Like I mentioned in the other thread, I think things went wrong when
> we generated the subreg in this sign_extend. The operation should
> have been a truncate of (reg/v:DI 200) followed by a sign extension
> of the result.
>
Sorry for my misunderstanding.
So you mean that in the RTL, for th
On Fri, 4 Aug 2023, Matthew Malcomson wrote:
> Hopefully last update ...
>
> > Specifically, please try compiling with
> >-ftime-report -fdiagnostics-format=sarif-file
> > and have a look at the generated .sarif file, e.g. via
> >python -m json.tool foo.c.sarif
> > which will pretty-print
The following adjusts the shift simplification patterns to avoid
touching arithmetic right shifts of possibly negative values by
out-of-bound shift amounts. While simplifying those to zero isn't
wrong, it violates the principle of least surprise.
Bootstrapped and tested on x86_64-unknown-linux-gnu,
Ping!
please review.
Thanks & Regards
Jeevitha
On 19/07/23 10:16 pm, jeevitha wrote:
> Hi All,
>
> The following patch has been bootstrapped and regtested on powerpc64le-linux.
>
> There are no instructions that do traditional AltiVec addresses (i.e.
> with the low four bits of the address mas
Ping!
please review.
Thanks & Regards
Jeevitha
On 20/07/23 10:05 am, jeevitha wrote:
> Hi All,
>
> The following patch has been bootstrapped and regtested on powerpc64le-linux.
>
> When the user specifies PTImode as an attribute, it breaks. Created
> a tree node to handle PTImode types. PTImod
On Thu, 3 Aug 2023 at 18:15, Richard Sandiford
wrote:
>
> can_div_trunc_p (a, b, &Q, &r) tries to compute a Q and r that
> satisfy the usual conditions for truncating division:
>
> (1) a = b * Q + r
> (2) |b * Q| <= |a|
> (3) |r| < |b|
>
> We can compute Q using the constant compone
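A trivial scalar instance of conditions (1)-(3), just to make the algebra
concrete (the routine itself works on poly_ints, not plain ints):
#include <assert.h>
#include <stdlib.h>
int
main (void)
{
  int a = 7, b = 3;
  int Q = a / b, r = a % b;         /* Q = 2, r = 1 */
  assert (a == b * Q + r);          /* (1) a = b * Q + r */
  assert (abs (b * Q) <= abs (a));  /* (2) |b * Q| <= |a| */
  assert (abs (r) < abs (b));       /* (3) |r| < |b| */
  return 0;
}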
On Thu, 3 Aug 2023 at 18:46, Richard Sandiford
wrote:
>
> Richard Sandiford writes:
> > Prathamesh Kulkarni writes:
> >> On Tue, 25 Jul 2023 at 18:25, Richard Sandiford
> >> wrote:
> >>>
> >>> Hi,
> >>>
> >>> Thanks for the rework and sorry for the slow review.
> >> Hi Richard,
> >> Thanks for
Hi all,
It is true that regcprop currently does not propagate sp and hence
leela is not optimized, but from what I see this should be something
we can address.
The reason that the propagation fails is this check that I have added
when I introduced maybe_copy_reg_attrs:
else if (REG_POINTER (new_
Hopefully last update ...
> Specifically, please try compiling with
>-ftime-report -fdiagnostics-format=sarif-file
> and have a look at the generated .sarif file, e.g. via
>python -m json.tool foo.c.sarif
> which will pretty-print the JSON to stdout.
Rebasing onto the JSON output was quit
YunQiang Su writes:
> PR #104914
>
> On platforms where TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode) is true,
> a zero_extract (SI, SI) can be sign-extended. So a zero_extract (DI,
> DI) followed by a sign_extend (SI, DI) can be merged into a single
> zero_extract (SI, SI).
>
> gcc/ChangeLog:
>
On Thu, Aug 03, 2023 at 01:20:00 AM Jeff Law wrote:
>
>
>
>So we're being a bit too aggressive with the .opt zicond patterns.
>
>
>> (define_insn "*czero.eqz..opt1"
>> [(set (match_operand:GPR 0 "register_operand" "=r")
>> (if_then_else:GPR (eq (match_operand:X 1 "regi
On Fri, Aug 4, 2023 at 10:52 AM Jan Hubicka wrote:
>
> Hi,
> so I found the problem. We duplicate multiple paths and end up with:
>
> ;; basic block 6, loop depth 0, count 365072224 (estimated locally, freq
> 0.3400)
> ;; prev block 12, next block 7, flags: (NEW, REACHABLE, VISITED)
> ;; pred:
On Thu, Aug 3, 2023 at 4:24 PM Andrzej Turko via Gcc-patches
wrote:
>
> Currently, fprintf calls that log to a dump file take line numbers
> in the match.pd file directly as arguments.
> When match.pd is edited, the line numbers of the referenced code change,
> which causes changes to many fprintf calls and,
This adds some more Xmega-like devices to the avr backend.
Johann
AVR: Add some more devices: AVR16DD*, AVR32DD*, AVR64DD*, AVR64EA*,
ATtiny42*, ATtiny82*, ATtiny162*, ATtiny322*, ATtiny10*.
gcc/
* config/avr/avr-mcus.def (avr64dd14, avr64dd20, avr64dd28,
avr64dd32)
(
> On Fri, Aug 4, 2023 at 9:16 AM Jan Hubicka via Gcc-patches
> wrote:
> >
> > Hi,
> > this prevents useless loop distribution produced in hmmer. With FDO we now
> > correctly work out that the loop created for the last iteration is not going to
> > iterate, however loop distribution still produces a ve
This fixes some minor typos in avr-mcus.def.
Johann
gcc/
* config/avr/avr-mcus.def (avr128d*, avr64d*): Fix their FLASH_SIZE
and PM_OFFSET entries.
diff --git a/gcc/config/avr/avr-mcus.def b/gcc/config/avr/avr-mcus.def
index ca99116adab..d0056c960ee 100644
--- a/gc
Hi,
so I found the problem. We duplicate multiple paths and end up with:
;; basic block 6, loop depth 0, count 365072224 (estimated locally, freq 0.3400)
;; prev block 12, next block 7, flags: (NEW, REACHABLE, VISITED)
;; pred: 4 [never (guessed)] count:0 (estimated locally, freq 0.)
On Fri, Aug 4, 2023 at 9:16 AM Jan Hubicka via Gcc-patches
wrote:
>
> Hi,
> this prevents useless loop distribution produced in hmmer. With FDO we now
> correctly work out that the loop created for the last iteration is not going to
> iterate, however loop distribution still produces a versioned loop th
On Thu, Aug 03, 2023 at 09:31:24PM +, Qing Zhao wrote:
> So, the basic question is:
>
> Given the following:
>
> struct fix {
> int others;
> int array[10];
> }
>
> extern struct fix * alloc_buf ();
>
> int main ()
> {
> struct fix *p = alloc_buf ();
> __builtin_object_size(p->array
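The quoted snippet is cut off; a hedged restatement of the question being
asked there (it does not assert what GCC actually returns):
#include <stddef.h>
struct fix { int others; int array[10]; };
extern struct fix *alloc_buf (void);
size_t
size_of_trailing_array (void)
{
  struct fix *p = alloc_buf ();
  /* The question in the thread: should mode 1 report the declared
     subobject size (sizeof (int[10]) == 40), or "unknown", given that
     trailing fixed-size arrays are often used like flexible arrays?  */
  return __builtin_object_size (p->array, 1);
}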
On Thu, Aug 03, 2023 at 07:55:54PM +, Qing Zhao wrote:
>
>
> > On Aug 3, 2023, at 1:51 PM, Kees Cook wrote:
> >
> > On August 3, 2023 10:34:24 AM PDT, Qing Zhao wrote:
> >> One thing I need to point out first is, currently, even for regular fixed
> >> size array in the structure,
> >> We
Hi,
this prevents useless loop distribution produced in hmmer. With FDO we now
correctly work out that the loop created for the last iteration is not going to
iterate, however loop distribution still produces a versioned loop that has no
chance to survive the loop vectorizer since we only keep distributed lo
> Canonicalizes (signed x << c) >> c into the lowest
> precision(type) - c bits of x IF those bits have a mode precision or a
> precision of 1. Also combines this rule with (unsigned x << c) >> c -> x &
> ((unsigned)-1 >> c) to prevent a duplicate pattern. Tested successfully on
> x86_64 and x86 targ
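A hedged scalar illustration of the two rules being combined, with c = 24 on
32-bit types (GCC documents that it treats the signed left shift here as
well-defined):
#include <stdint.h>
int32_t
sext_low8 (int32_t x)
{
  /* (signed x << 24) >> 24: the low 8 bits of x, sign-extended.  */
  return (x << 24) >> 24;
}
uint32_t
low8 (uint32_t x)
{
  /* (unsigned x << 24) >> 24 equals x & (UINT32_MAX >> 24), i.e. x & 0xff.  */
  return (x << 24) >> 24;
}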
> >
> > A couple cycles ago I separated most of code to distinguish between the
> > back and forward threaders. There is class jt_path_registry that is
> > common to both, and {fwd,back}_jt_path_registry for the forward and
> > backward threaders respectively. It's not perfect, but it's a start.