Could you show me a piece of the codegen difference in x264 that
improves performance?
juzhe.zh...@rivai.ai
From: Robin Dapp
Date: 2025-01-22 15:29
To: gcc-patches
CC: pal...@dabbelt.com; kito.ch...@gmail.com; juzhe.zh...@rivai.ai;
jeffreya...@gmail.com; pan2...@intel.com; rdapp@g
Hi,
after testing on the BPI (4.2% improvement for x264 input 1, 4.4% for input 2)
and the discussion in PR117173 I figured it's best to disable the two-source
permutes by default for now. We quickly talked about this on the patchwork
call last week. Conclusion was to just post the patch and dis
On Wed, 2025-01-22 at 12:00 +0800, Lulu Cheng wrote:
> >
> > Currently, command fusion can only be done in the following situations:
> >
> > bstrpick.d rd, rs, 31, 0 + alsl.d rd1,rj,rk,shamt and "rd = rj"
> I learned from my colleagues that command fusion requires
> rd != rs.
Hmm
On 2025/1/21 at 4:41 PM, Lulu Cheng wrote:
On 2025/1/21 at 12:59 PM, Xi Ruoyao wrote:
On Tue, 2025-01-21 at 11:46 +0800, Lulu Cheng wrote:
On 2025/1/18 at 7:33 PM, Xi Ruoyao wrote:
/* snip */
;; This code iterator allows unsigned and signed division to be
generated
;; from the same template.
@@ -3083,39 +30
Hi all,
These two testcases were missed in the previous addition of
-march=x86-64-v3 to silence warnings in -march=native tests.
Ok for trunk?
Thx,
Haochen
gcc/testsuite/ChangeLog:
* gcc.target/i386/vnniint16-auto-vectorize-4.c: Append
-march=x86-64-v3.
* gcc.target/i386/vn
On Wed, 2025-01-22 at 10:37 +0800, Lulu Cheng wrote:
>
> On 2025/1/22 at 8:49 AM, Xi Ruoyao wrote:
> > The second source register of this insn cannot be the same as the
> > destination register.
> >
> > gcc/ChangeLog:
> >
> > * config/loongarch/loongarch.md
> > (_alsl_reversesi_extended): Add '&
On 2025/1/22 at 8:49 AM, Xi Ruoyao wrote:
The second source register of this insn cannot be the same as the
destination register.
gcc/ChangeLog:
* config/loongarch/loongarch.md
(_alsl_reversesi_extended): Add '&' to the destination
register constraint and append '0' to the fir
On Tue, Jan 21, 2025 at 4:42 PM Haochen Jiang wrote:
>
> Hi all,
>
> Recently, DMR ISAs got lots of changes in mnemonics. The detailed changes
> are:
>
> - NE would be removed for all AVX10.2 new insns
> - VCOMSBF16 -> VCOMISBF16
> - P for packed omitted for AI data types (BF16, TF32, FP8)
>
On Thu, Aug 8, 2024 at 2:07 PM Andrew Pinski wrote:
>
> On Fri, Aug 2, 2024 at 7:30 AM Jeff Law wrote:
> >
> >
> >
> > On 8/1/24 4:12 AM, Surya Kumari Jangala wrote:
> > > lra: emit caller-save register spills before call insn [PR116028]
> > >
> > > LRA emits insns to save caller-save registers i
Thanks Jeff for comments.
> So a bit of high level background why this is needed would be helpful.
I see. The problem comes from the gen_lowpart when passing the args to SAT_SUB
directly (i.e. without function args).
SAT_SUB with args, we have input rtx (subreg/s/u:QI (reg/v:DI 135 [ x ]) 0),
and th
The uarch can fuse bstrpick.d rd,rs1,31,0 and alsl.d rd,rd,rs2,shamt,
so for this special case we should use alsl.d instead of slli.d. And
I'd hoped late combine to handle slli.d + and + add.d => and + slli.d +
add.d => and + alsl.d, but it does not always work (even before the
alsl.d special case
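For reference, a minimal C sketch of the source pattern the message above is discussing: zero-extend the low 32 bits, shift left, then add. The function name and the shift amount of 2 are assumptions for illustration only. Whether a given compiler emits the bstrpick.d + alsl.d pair for it depends on the GCC version and options.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from the patch.
   The mask corresponds to "bstrpick.d rd,rs1,31,0" (keep bits 31..0),
   and the shift-then-add corresponds to "alsl.d rd,rd,rs2,2".  */
uint64_t zext_shift_add(uint64_t a, uint64_t x)
{
    uint64_t low32 = a & 0xffffffffu;   /* zero-extend the low 32 bits */
    return (low32 << 2) + x;            /* shift by 2, then add */
}
```

Compiling such a function at -O2 for a LoongArch target and reading the assembly is one way to see which of the instruction sequences named above was selected.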
The second source register of this insn cannot be the same as the
destination register.
gcc/ChangeLog:
* config/loongarch/loongarch.md
(_alsl_reversesi_extended): Add '&' to the destination
register constraint and append '0' to the first source register
constraint
The first patch fixes a wrong-code bug caused by
_alsl_reversesi_extended which mistakenly accepted the same hard
register for the destination and the addend.
The second patch partially fixes the performance regression caused by
failing to combine the instructions in some cases and failing to utilize
On Tue, Jan 21, 2025 at 11:00:13AM -0500, Jason Merrill wrote:
> On 1/21/25 9:54 AM, Jason Merrill wrote:
> > On 1/20/25 5:58 PM, Marek Polacek wrote:
> > > @@ -9087,7 +9092,9 @@ cxx_eval_outermost_constant_expr (tree t, bool
> > > allow_non_constant,
> > > return r;
> > > else if (non_co
This patch simply adds an op2_range to operator_div which returns
non-zero if the LHS is not undefined. This means given an integral
division:
x = y / z
'z' will have a range of [-INF, -1] [1, +INF] after execution of the
statement.
This is relatively straightforward and resolve
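A minimal C sketch of the kind of code such a range could affect, assuming the optimizer makes use of it. The function name is hypothetical and this is not a testcase from the patch: once the division has executed, 'z' cannot be zero, so the later comparison is a candidate for folding.

```c
/* Hypothetical example, not taken from the patch.
   Integer division by zero is undefined, so after "y / z" has executed
   a range-aware optimizer may assume z is in [-INF, -1] [1, +INF].  */
int div_then_test(int y, int z)
{
    int x = y / z;
    if (z == 0)        /* may be treated as always false after the division */
        return -1;
    return x;
}
```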
I've had a TODO for a while to remove the CMO patch from October;
essentially the C API docs for this stuff aren't approved yet. Until
that MR gets merged we should not expose this API.
Specifically I'm reverting:
Revert "[PATCH 1/2] RISC-V:Add intrinsic support for the CMOs extensions
On 1/20/25 2:18 AM, pan2...@intel.com wrote:
From: Pan Li
This patch would like to fix the wrong code generation for the scalar
signed SAT_SUB. The input can be QI/HI/SI/DI while the alu like sub
can only work on Xmode, thus we need to make sure the value of input
are well signed-extended
On 1/5/25 3:01 PM, Simon Martin wrote:
We currently fail with a checking assert upon the following valid code
when using -fno-elide-constructors
=== cut here ===
struct d { ~d(); };
d &b();
struct f {
[[__no_unique_address__]] d e;
};
struct h : f {
h() : f{b()} {}
} i;
=== cut here ===
On Tue, Jan 21, 2025 at 05:21:52PM -0500, Jason Merrill wrote:
> > + v should be initialized with make_tree_vector (); followed by
> > + vec_safe_reserve (v, nelts); or equivalently vec_alloc (v, nelts);
> > + optionally followed by pushes of other elements (up to
> > + nelts - CONSTRUCTOR_
On 1/21/25 5:02 PM, Jakub Jelinek wrote:
On Tue, Jan 21, 2025 at 04:39:58PM -0500, Jason Merrill wrote:
On 1/21/25 11:15 AM, Jakub Jelinek wrote:
On Tue, Jan 21, 2025 at 11:06:35AM -0500, Jason Merrill wrote:
--- gcc/c-family/c-common.cc.jj 2025-01-20 18:00:35.667875671 +0100
+++ gcc/c-family/
On 1/16/25 5:42 PM, Marek Polacek wrote:
On Wed, Jan 15, 2025 at 04:18:36PM -0500, Jason Merrill wrote:
On 1/15/25 12:55 PM, Marek Polacek wrote:
On Wed, Jan 15, 2025 at 09:39:41AM -0500, Jason Merrill wrote:
On 11/15/24 9:08 AM, Marek Polacek wrote:
Bootstrapped/regtested on x86_64-pc-linux-
On 1/16/25 7:10 PM, Andrew Pinski wrote:
While adding a new match pattern, g++.dg/cpp2a/consteval36.C started to ICE and
that was
because we would call fold even if one of the operands of the comparison was an
error_mark_node.
I found a new testcase which also ICEs before this patch too so show
On Tue, Jan 21, 2025 at 04:39:58PM -0500, Jason Merrill wrote:
> On 1/21/25 11:15 AM, Jakub Jelinek wrote:
> > On Tue, Jan 21, 2025 at 11:06:35AM -0500, Jason Merrill wrote:
> > > > --- gcc/c-family/c-common.cc.jj 2025-01-20 18:00:35.667875671 +0100
> > > > +++ gcc/c-family/c-common.cc 2025-01-2
On 1/15/25 7:36 PM, yxj-github-437 wrote:
On Fri, Jan 03, 2025 at 05:18:55PM +, xxx wrote:
From: yxj-github-437 <2457369...@qq.com>
This patch attempts to fix an error when building module std. The error
occurs because __builtin_va_list (aka struct __va_list) has internal linkage, so
atte
On 1/16/25 2:02 PM, Patrick Palka wrote:
On Mon, 13 Jan 2025, Jason Merrill wrote:
On 1/10/25 2:20 PM, Patrick Palka wrote:
Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look
OK for trunk?
The documentation for LAMBDA_EXPR_THIS_CAPTURE seems outdated because
it says the field i
On 1/3/25 3:00 PM, Simon Martin wrote:
We currently accept this code with c++ <= 17 even though it's invalid
since the base is not initialized (we properly reject it with c++ >= 20)
=== cut here ===
struct NoMut1 { int a, b; };
struct NoMut3 : NoMut1 {
constexpr NoMut3(int a, int b) {}
};
voi
On Tuesday, 21.01.2025 at 21:13 +, Joseph Myers wrote:
> On Tue, 21 Jan 2025, Martin Uecker wrote:
>
> > The bigger issue seems that if you forward reference a member, you
> > do not yet know its type. So whatever syntax we pick, general expressions
> > seem problematic anyway:
> >
> >
On 1/21/25 1:02 PM, Jakub Jelinek wrote:
On Tue, Jan 21, 2025 at 06:47:53PM +0100, Jakub Jelinek wrote:
Indeed, I've just used what it was doing without thinking too much about it,
sorry.
addl_args = tree_cons (NULL_TREE, arg, addl_args);
with addl_args = nreverse (addl_args); after the loop mig
On 1/21/25 11:15 AM, Jakub Jelinek wrote:
On Tue, Jan 21, 2025 at 11:06:35AM -0500, Jason Merrill wrote:
--- gcc/c-family/c-common.cc.jj 2025-01-20 18:00:35.667875671 +0100
+++ gcc/c-family/c-common.cc 2025-01-21 09:29:23.955582581 +0100
@@ -9010,33 +9010,46 @@ make_tree_vector_from_list (tre
On Tue, 21 Jan 2025, Martin Uecker wrote:
> The bigger issue seems that if you forward reference a member, you
> do not yet know its type. So whatever syntax we pick, general expressions
> seem problematic anyway:
>
> struct {
> char *buf [[counted_by(2 * .n + 3)]];
> unsigned int n;
That's
On Tuesday, 21.01.2025 at 21:15 +0100, Martin Uecker wrote:
> On Tuesday, 21.01.2025 at 19:45 +, Joseph Myers wrote:
> > On Tue, 21 Jan 2025, Martin Uecker wrote:
> >
> > > Couldn't you use the rule that .len refers to the closest enclosing
> > > structure
> > > even without __self
> On 20 Jan 2025, at 18:33, Andrew Carlotti wrote:
>
> On Mon, Jan 20, 2025 at 06:29:12PM +, Tamar Christina wrote:
>>> -Original Message-
>>> From: Iain Sandoe
>>> Sent: Monday, January 20, 2025 6:15 PM
>>> To: Andrew Carlotti
>>> Cc: Kyrylo Tkachov ; GCC Patches >> patc...@gcc.
On Tuesday, 21.01.2025 at 19:45 +, Joseph Myers wrote:
> On Tue, 21 Jan 2025, Martin Uecker wrote:
>
> > Couldn't you use the rule that .len refers to the closest enclosing structure
> > even without __self__ ? This would then also disambiguate between
> > designators
> > and other uses
Dimitar Dimitrov writes:
> Test case is valid even if size of int is more than 32 bits.
>
> Pushed to trunk as obvious.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/torture/pr117546.c: Require effective target int32plus.
>
> Cc: Georg-Johann Lay
> Cc: Sam James
> Signed-off-by: Dimitar Dimit
On Tue, 21 Jan 2025, Martin Uecker wrote:
> Couldn't you use the rule that .len refers to the closest enclosing structure
> even without __self__ ? This would then also disambiguate between designators
> and other uses.
Right now, an expression cannot start with '.', which provides the
disambigu
Test case is valid even if size of int is more than 32 bits.
Pushed to trunk as obvious.
gcc/testsuite/ChangeLog:
* gcc.dg/torture/pr117546.c: Require effective target int32plus.
Cc: Georg-Johann Lay
Cc: Sam James
Signed-off-by: Dimitar Dimitrov
---
gcc/testsuite/gcc.dg/torture/pr11
On Tue, Jan 21, 2025 at 05:15:17PM +0100, Jakub Jelinek wrote:
> On Tue, Jan 21, 2025 at 11:06:35AM -0500, Jason Merrill wrote:
> > > --- gcc/c-family/c-common.cc.jj 2025-01-20 18:00:35.667875671 +0100
> > > +++ gcc/c-family/c-common.cc 2025-01-21 09:29:23.955582581 +0100
> > > @@ -9010,33 +
On Tuesday, 21.01.2025 at 18:40 +, Joseph Myers wrote:
> On Tue, 21 Jan 2025, Qing Zhao wrote:
>
> > So, even after we introduce the designator syntax for counted_by attribute,
> > arbitrary expressions as:
> >
> > counted_by (.len1 + const)
> > counted_by (.len1 + .len2)
> >
> > St
Hi,
This patch adds a MIPS64 implementation of `fiber_switchContext',
replacing the generic implementation. The `core.thread.fiber' module
already defines version=AsmExternal on mips64el-linux-gnuabi64 targets.
Committed to mainline.
Regards,
Iain.
---
PR d/118584
libphobos/ChangeLog:
On Tue, 21 Jan 2025, Vineet Gupta wrote:
> Silly question, what exactly is the procedure calling convention rule for
> FCSR/FRM ? Is it a Caller saved or a Callee saved Reg.
> The psABI CC doc is not explicit in those terms at least [1]
>
> | "The Floating-Point Control and Status Register (fcs
On Tue, 21 Jan 2025, Qing Zhao wrote:
> > On Jan 20, 2025, at 16:19, Joseph Myers wrote:
> >
> > On Sat, 18 Jan 2025, Kees Cook wrote:
> >
> >> Gaining access to global variables is another gap Linux has -- e.g. we
> >> have arrays that are sized by the global number-of-cpus variable. :)
> >
>
On Tue, 21 Jan 2025, Qing Zhao wrote:
> So, even after we introduce the designator syntax for counted_by attribute,
> arbitrary expressions as:
>
> counted_by (.len1 + const)
> counted_by (.len1 + .len2)
>
> Still cannot be supported?
Indeed. Attempting to use ".len1" inside something that
On 1/20/25 19:07, Li, Pan2 wrote:
> Agree, the mode-switch will take care of the frm when meet a call (covered by
> testcase already).
>
>
> extern size_t normalize_vl_1 (size_t vl);
> extern size_t normalize_vl_2 (size_t vl);
>
> vfloat32m1_t
>
Hi,
This patch was committed some time ago in r14-10036, now it's being
backported to the gcc-13 and gcc-12 release branches.
The ICE in the D front-end was found to be caused by the hidden closure
parameter type being, in some cases, generated too early for nested
functions. Better to update the
On Tue, Jan 21, 2025 at 05:35:02PM +0100, Jakub Jelinek wrote:
> On Tue, Jan 21, 2025 at 05:15:17PM +0100, Jakub Jelinek wrote:
> > On Tue, Jan 21, 2025 at 11:06:35AM -0500, Jason Merrill wrote:
> > > > --- gcc/c-family/c-common.cc.jj 2025-01-20 18:00:35.667875671 +0100
> > > > +++ gcc/c-family/c-c
On Tue, Jan 21, 2025 at 9:55 AM Jeff Law wrote:
>
>
>
> On 1/20/25 9:38 PM, Andrew Pinski wrote:
> > In a similar way find_split_point handles `a+b*C`, this adds
> > the split point for `~a & b`. This allows for better instruction
> > selection when the target has this instruction (aarch64, arm a
On Tue, Jan 21, 2025 at 06:47:53PM +0100, Jakub Jelinek wrote:
> Indeed, I've just used what it was doing without thinking too much about it,
> sorry.
> addl_args = tree_cons (NULL_TREE, arg, addl_args);
> with addl_args = nreverse (addl_args); after the loop might be better,
> can test that increm
On 1/20/25 9:38 PM, Andrew Pinski wrote:
In a similar way find_split_point handles `a+b*C`, this adds
the split point for `~a & b`. This allows for better instruction
selection when the target has this instruction (aarch64, arm and x86_64
are examples which have this).
Built and tested for a
On 1/21/25 6:11 AM, Jin Ma wrote:
Although we have handled the vl of XTheadVector correctly in the
expand phase and predicates, the results show that the work is
still insufficient.
In the curr_insn_transform function, the insn is transformed from:
(insn 69 67 225 12 (set (mem:RVVM8SF (reg/f:
On Tue, Jan 21, 2025 at 12:04:36PM -0500, Jason Merrill wrote:
> > --- gcc/cp/parser.cc.jj 2025-01-17 19:27:34.052140136 +0100
> > +++ gcc/cp/parser.cc 2025-01-20 20:16:23.082876036 +0100
> > @@ -36632,14 +36632,22 @@ cp_parser_objc_message_args (cp_parser*
> > /* Handle non-selector
On 1/21/25 10:15 AM, Robin Dapp wrote:
I'm going to push the attached as obvious if my local test shows
no issues.
Yea, please do. Thanks.
jeff
On 1/21/25 5:52 AM, Jin Ma wrote:
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/rvv.exp: Enable testsuite of
XTheadVector.
* gcc.target/riscv/rvv/xtheadvector/pr114194.c: Adjust correctly.
* gcc.target/riscv/rvv/xtheadvector/prefix.c: Likewise.
* gcc.
Georg-Johann Lay writes:
> u16 << 5 and u16 << 6 can be tweaked by using MUL instructions.
> Benefit is a better speed ratio with -Os and smaller size with -O2.
>
> No new regressions.
>
> Ok for trunk?
Ok. Please apply.
Denis.
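The arithmetic behind that tweak can be sketched in C as follows. This shows only the equivalence between the shift and a multiplication by 32. The function names are hypothetical, and the actual AVR instruction sequence chosen by the patch is not shown in the message above.

```c
#include <stdint.h>

/* Hypothetical helpers, not taken from the patch.
   For a 16-bit unsigned value, shifting left by 5 and multiplying
   by 32 give the same result modulo 2^16.  */
uint16_t shl5_shift(uint16_t x)
{
    return (uint16_t)((unsigned long)x << 5);
}

uint16_t shl5_mul(uint16_t x)
{
    return (uint16_t)((unsigned long)x * 32u);
}
```

The casts to unsigned long keep the intermediate arithmetic well defined on targets where int is only 16 bits wide.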
Richard Sandiford writes:
> Denis Chertykov writes:
>> PR rtl-optimization/117868
>> gcc/
>> * lra-spills.cc (assign_stack_slot_num_and_sort_pseudos): Reuse slots
>> only without allocated memory or only with equal or smaller registers
>> with equal or smaller alignment.
>>
On Fri, Oct 18, 2024 at 11:52:26AM +0530, Tejas Belagod wrote:
> +/* This worksharing construct binds to an implicit outer parallel region in
> +whose scope va is declared and therefore is default private. This causes
> +the lastprivate clause list item va to be diagnosed as private in the
On Fri, Oct 18, 2024 at 11:52:25AM +0530, Tejas Belagod wrote:
> The target clause in OpenMP is used to offload loop kernels to accelerator
> peripherals. target's 'map' clause is used to move data from and to the
> accelerator. When the data is SVE type, it may not be suitable because of
> vari
I'm going to push the attached as obvious if my local test shows
no issues.
Regards
Robin
[PATCH] RISC-V: Unbreak bootstrap.
This fixes a wrong format specifier and an unused variable which should
re-enable bootstrap.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_file_end): Fix format
On Fri, Oct 18, 2024 at 11:52:23AM +0530, Tejas Belagod wrote:
> This patch adds a test scaffold for OpenMP compile tests under the
> gcc.target
> testsuite. It also adds a target tests directory libgomp.target along with an
> SVE execution test
>
> gcc/testsuite/ChangeLog:
>
> * gcc.t
Hi,
On Tue, 2025-01-21 at 14:46 +0100, Mark Wielaard wrote:
> Thanks. And if you need help with that please let people know.
> The riscv bootstrap has been broken now for 5 days.
> And it really looks like it is as simple as just removing that one
> line.
Sorry, I missed that you already pushed t
On 1/21/25 10:51 AM, Jakub Jelinek wrote:
Hi!
As the following testcases show, I forgot to handle CPP_EMBED in
cp_parser_objc_message_args which is another place which can parse
possibly long valid lists of CPP_COMMA separated CPP_NUMBER tokens.
Bootstrapped/regtested on x86_64-linux and i686-l
On Fri, Oct 18, 2024 at 11:52:22AM +0530, Tejas Belagod wrote:
> Currently poly-int type structures are passed by value to OpenMP runtime
> functions for shared clauses etc. This patch improves on this by passing
> around poly-int structures by address to avoid copy-overhead.
>
> gcc/ChangeLog
>
On Tue, Jan 21, 2025 at 11:06:35AM -0500, Jason Merrill wrote:
> > --- gcc/c-family/c-common.cc.jj 2025-01-20 18:00:35.667875671 +0100
> > +++ gcc/c-family/c-common.cc 2025-01-21 09:29:23.955582581 +0100
> > @@ -9010,33 +9010,46 @@ make_tree_vector_from_list (tree list)
> > return re
On Tue, Jan 21, 2025 at 04:28:59PM +0100, Georg-Johann Lay wrote:
> On 18.01.25 at 19:30, Dimitar Dimitrov wrote:
> > This test fails on AVR.
> >
> > Debugging the test on x86 host, I noticed that u in function s sometimes
> > has value 16128. The "t <= 3 * u" expression in the same function
> >
On 1/21/25 10:52 AM, Jakub Jelinek wrote:
On Mon, Jan 20, 2025 at 05:14:33PM -0500, Jason Merrill wrote:
--- gcc/cp/call.cc.jj 2025-01-15 18:24:36.135503866 +0100
+++ gcc/cp/call.cc 2025-01-17 14:42:38.201643385 +0100
@@ -4258,11 +4258,30 @@ add_list_candidates (tree fns, tree firs
/
Ping.
On Fri, Jan 10, 2025 at 03:07:52PM -0500, Marek Polacek wrote:
> Ping.
>
> On Fri, Dec 20, 2024 at 08:58:05AM -0500, Marek Polacek wrote:
> > Ping.
> >
> > On Tue, Nov 26, 2024 at 05:35:50PM -0500, Marek Polacek wrote:
> > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> >
On 1/21/25 9:54 AM, Jason Merrill wrote:
On 1/20/25 5:58 PM, Marek Polacek wrote:
On Mon, Jan 20, 2025 at 12:39:03PM -0500, Jason Merrill wrote:
On 1/20/25 12:27 PM, Marek Polacek wrote:
On Mon, Jan 20, 2025 at 11:46:44AM -0500, Jason Merrill wrote:
On 1/20/25 10:27 AM, Marek Polacek wrote:
On Tue, Jan 07, 2025 at 01:49:04PM +0100, Jakub Jelinek wrote:
> On Wed, Dec 18, 2024 at 12:15:15PM +0100, Jakub Jelinek wrote:
> > On Fri, Dec 06, 2024 at 05:07:40PM +0100, Jakub Jelinek wrote:
> > > I'd like to ping the
> > > https://gcc.gnu.org/pipermail/gcc-patches/2024-November/668699.html
> >
On Mon, Jan 20, 2025 at 05:15:51PM -0500, Vladimir Makarov wrote:
> The patch fixes
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118560
The fix for this PR has been committed without a testcase.
The following testcase would take at least 15 minutes to compile
on a fast machine (powerpc64-linux
Hi!
On top of the
https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662507.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662750.html
patches (where the first one implements CWG2867 for block scope static
or thread_local structured bindings and the latter for namespace scope
On Mon, Jan 20, 2025 at 05:14:33PM -0500, Jason Merrill wrote:
> > --- gcc/cp/call.cc.jj 2025-01-15 18:24:36.135503866 +0100
> > +++ gcc/cp/call.cc 2025-01-17 14:42:38.201643385 +0100
> > @@ -4258,11 +4258,30 @@ add_list_candidates (tree fns, tree firs
> > /* Expand the CONSTRUCTOR into
Hi!
As the following testcases show, I forgot to handle CPP_EMBED in
cp_parser_objc_message_args which is another place which can parse
possibly long valid lists of CPP_COMMA separated CPP_NUMBER tokens.
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
2025-01-20 Jakub Jelin
On Tue, 2025-01-21 at 23:18 +0800, Xi Ruoyao wrote:
> On Tue, 2025-01-21 at 22:14 +0800, Xi Ruoyao wrote:
> > > > in GCC 13 the result is:
> > > >
> > > > or $r12,$r4,$r0
> > >
> > > Hmm, this strange move is caused by "&" in bstrpick_alsl_paired.
> > > Is it
> > > really needed for
On 18.01.25 at 19:30, Dimitar Dimitrov wrote:
This test fails on AVR.
Debugging the test on x86 host, I noticed that u in function s sometimes
has value 16128. The "t <= 3 * u" expression in the same function
results in signed integer overflow for targets with a 16-bit int.
Fix by requiring
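The overflow described in the message above can be checked with a short C sketch. The helper name is hypothetical. It computes 3 * u in 32 bits and tests whether the result would still fit in a 16-bit signed int, which is the width of int on AVR.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from the testcase.
   Returns 1 if 3 * u does not fit in a 16-bit signed integer.  */
int triple_overflows_int16(int32_t u)
{
    int32_t product = (int32_t)3 * u;   /* evaluated in 32 bits */
    return product > INT16_MAX || product < INT16_MIN;
}
```

With u = 16128 the product is 48384, which exceeds INT16_MAX (32767), so the helper reports an overflow for the value mentioned in the message.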
On Tue, 2025-01-21 at 22:14 +0800, Xi Ruoyao wrote:
> > > in GCC 13 the result is:
> > >
> > > or $r12,$r4,$r0
> >
> > Hmm, this strange move is caused by "&" in bstrpick_alsl_paired. Is it
> > really needed for the fusion?
>
> Never mind, it's needed or a = ((a & 0x) << 1) + a w
> On Jan 20, 2025, at 16:19, Joseph Myers wrote:
>
> On Sat, 18 Jan 2025, Kees Cook wrote:
>
>> Gaining access to global variables is another gap Linux has -- e.g. we
>> have arrays that are sized by the global number-of-cpus variable. :)
>
> Note that it's already defined that counted_by ta
> On Jan 17, 2025, at 18:13, Joseph Myers wrote:
>
> On Fri, 17 Jan 2025, Qing Zhao wrote:
>
>> struct fc_bulk {
>> ...
>> struct fs_bulk fs_bulk;
>> struct fc fcs[] __counted_by(fs_bulk.len);
>> };
>>
>> i.e, the “counted_by” field is in the inner structure of the current
>> structure of
On 1/20/25 5:58 PM, Marek Polacek wrote:
On Mon, Jan 20, 2025 at 12:39:03PM -0500, Jason Merrill wrote:
On 1/20/25 12:27 PM, Marek Polacek wrote:
On Mon, Jan 20, 2025 at 11:46:44AM -0500, Jason Merrill wrote:
On 1/20/25 10:27 AM, Marek Polacek wrote:
On Fri, Jan 17, 2025 at 06:38:45PM -0500,
Hi!
On 2025-01-20T08:40:25+, Tamar Christina wrote:
>> From: Thomas Schwinge
>> Sent: Monday, January 13, 2025 9:54 AM
>> On 2025-01-10T21:22:03+, Tamar Christina via Gcc-cvs > c...@gcc.gnu.org> wrote:
>> > https://gcc.gnu.org/g:68326d5d1a593dc0bf098c03aac25916168bc5a9
>> >
>> > commit
On Tue, 2025-01-21 at 22:14 +0800, Xi Ruoyao wrote:
> On Tue, 2025-01-21 at 21:52 +0800, Xi Ruoyao wrote:
> > > struct Pair { unsigned long a, b; };
> > >
> > > struct Pair
> > > test (struct Pair p, long x, long y)
> > > {
> > > p.a &= 0x;
> > > p.a <<= 2;
> > > p.a += x;
> > > p.
On Tue, 2025-01-21 at 21:52 +0800, Xi Ruoyao wrote:
> > struct Pair { unsigned long a, b; };
> >
> > struct Pair
> > test (struct Pair p, long x, long y)
> > {
> > p.a &= 0x;
> > p.a <<= 2;
> > p.a += x;
> > p.b &= 0x;
> > p.b <<= 2;
> > p.b += x;
> > return p;
> > }
On Tue, Jan 21, 2025 at 10:52 AM Jakub Jelinek wrote:
>
> On Tue, Jan 21, 2025 at 06:31:43AM -0300, Alexandre Oliva wrote:
> > On Jan 21, 2025, Richard Biener wrote:
> >
> > > you can use bit_field_size () and bit_field_offset () unconditionally,
> >
> > Nice, thanks!
> >
> > > Now, we don't have
There are calls to dr_misalignment left that do not correct for the
offset (which is vector type dependent) when the stride is negative.
Notably vect_known_alignment_in_bytes doesn't allow to pass through
such offset which the following adds (computing the offset in
vect_known_alignment_in_bytes wo
On Tue, 2025-01-21 at 21:23 +0800, Xi Ruoyao wrote:
/* snip */
> > It seems to be more formal through TARGET_SCHED_MACRO_FUSION_P and
> >
> > TARGET_SCHED_MACRO_FUSION_PAIR_P. I found the spec test item that
> > generated
> >
> > this instruction pair. I implemented these two hooks to see if it
Hi!
On 2025-01-16T15:57:52+0100, I wrote:
> I have noticed that '-fdump-tree-original-lineno' for Fortran (for
> example) does dump location information, but for C/C++ it does not.
> The reason is that Fortran (and other front ends) use code like:
>
> /* Output the GENERIC tree. */
> dump
This updates aarch64.opt.urls after my patch earlier today.
Pushing directly as it's an obvious fix.
gcc/ChangeLog:
* config/aarch64/aarch64.opt.urls: Regenerate
---
gcc/config/aarch64/aarch64.opt.urls | 3 +++
1 file changed, 3 insertions(+)
diff --git a/gcc/config/aarch64/aarch64.op
Hi,
On Sat, 2025-01-18 at 09:34 +0800, Monk Chiang wrote:
> Thanks, I will fix it.
Thanks. And if you need help with that please let people know.
The riscv bootstrap has been broken now for 5 days.
And it really looks like it is as simple as just removing that one
line.
Cheers,
Mark
>
> > Mar
On Tue, 2025-01-21 at 20:34 +0800, Lulu Cheng wrote:
>
> On 2025/1/21 at 6:05 PM, Xi Ruoyao wrote:
> > On Tue, 2025-01-21 at 16:41 +0800, Lulu Cheng wrote:
> > > On 2025/1/21 at 12:59 PM, Xi Ruoyao wrote:
> > > > On Tue, 2025-01-21 at 11:46 +0800, Lulu Cheng wrote:
> > > > > On 2025/1/18 at 7:33 PM, Xi Ruoyao wrote:
> >
Denis Chertykov writes:
> PR rtl-optimization/117868
> gcc/
> * lra-spills.cc (assign_stack_slot_num_and_sort_pseudos): Reuse slots
> only without allocated memory or only with equal or smaller registers
> with equal or smaller alignment.
> (lra_spill): Print slot siz
Although we have handled the vl of XTheadVector correctly in the
expand phase and predicates, the results show that the work is
still insufficient.
In the curr_insn_transform function, the insn is transformed from:
(insn 69 67 225 12 (set (mem:RVVM8SF (reg/f:DI 218 [ _77 ]) [0 S[128, 128]
A32])
LGTM but defer to GCC 16 :)
On Tue, Jan 21, 2025 at 11:43 AM wrote:
>
> From: yulong
>
> This patch implements the SiFive vendor extension Xsfvcp[1]
> support to gcc. Providing a flexible mechanism to extend application
> processors with custom coprocessors and variable-latency arithmetic
>
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/rvv.exp: Enable testsuite of
XTheadVector.
* gcc.target/riscv/rvv/xtheadvector/pr114194.c: Adjust correctly.
* gcc.target/riscv/rvv/xtheadvector/prefix.c: Likewise.
* gcc.target/riscv/rvv/xtheadvector/vlb-vsb.c
The following amends the previous fix to mark all of the loop BBs
as needing to be scanned for new LC PHI uses when its nesting parents
changed, noticing one caller of fix_loop_placement was already
doing that. So the following moves this code into fix_loop_placement,
covering both callers now.
Boot
On 1/21/2025 11:37 AM, Richard Sandiford wrote:
Thanks for the update. LGTM with one trivial fix:
writes:
diff --git a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc
b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc
index ca721dd2c09..d8776a55230 100644
--- a/gcc/config/aarch64/aarch64-
Hi Jason,
On 20 Jan 2025, at 22:50, Jason Merrill wrote:
> On 1/4/25 10:13 AM, Simon Martin wrote:
>> The invalid case in this PR trips on an assertion in
>> build_class_member_access_expr that build_base_path would never
>> return
>> an error_mark_node, which is actually incorrect if the object
On 2025/1/21 at 6:05 PM, Xi Ruoyao wrote:
On Tue, 2025-01-21 at 16:41 +0800, Lulu Cheng wrote:
On 2025/1/21 at 12:59 PM, Xi Ruoyao wrote:
On Tue, 2025-01-21 at 11:46 +0800, Lulu Cheng wrote:
On 2025/1/18 at 7:33 PM, Xi Ruoyao wrote:
/* snip */
;; This code iterator allows unsigned and signed division to be gen
Thanks for the update. LGTM with one trivial fix:
writes:
> diff --git a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc
> b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc
> index ca721dd2c09..d8776a55230 100644
> --- a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc
> +++ b/gcc/config/a
On Tue, 2025-01-21 at 16:41 +0800, Lulu Cheng wrote:
>
> On 2025/1/21 at 12:59 PM, Xi Ruoyao wrote:
> > On Tue, 2025-01-21 at 11:46 +0800, Lulu Cheng wrote:
> > > On 2025/1/18 at 7:33 PM, Xi Ruoyao wrote:
> > > /* snip */
> > > > ;; This code iterator allows unsigned and signed division to be
> > > > generate
This patch introduces support for LUTI2/LUTI4 ACLE for SVE2.
LUTI instructions are used for efficient table lookups with 2-bit
or 4-bit indices. LUTI2 reads indexed 8-bit or 16-bit elements from
the low 128 bits of the table vector using packed 2-bit indices,
while LUTI4 can read from the low 128
On Tue, Jan 21, 2025 at 06:31:43AM -0300, Alexandre Oliva wrote:
> On Jan 21, 2025, Richard Biener wrote:
>
> > you can use bit_field_size () and bit_field_offset () unconditionally,
>
> Nice, thanks!
>
> > Now, we don't have the same handling on BIT_FIELD_REFs but it
> > seems it's enough to a
Pushed to r15-7092 and r15-7093.
On 2025/1/20 at 5:54 PM, Lulu Cheng wrote:
Currently, the following items are supported:
__attribute__ ((target ("{no-}strict-align")))
__attribute__ ((target ("cmodel=")))
__attribute__ ((target ("arch=")))
__attribute__ ((target ("t