Severity: enhancement
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: roger at nextmovesoftware dot com
Target Milestone: ---
Target: nvptx-*-*
nvptx supports instructions for integer addition and subtraction optio
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120296
Roger Sayle changed:
What|Removed |Added
CC||roger at nextmovesoftware dot
com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115024
Roger Sayle changed:
What|Removed |Added
CC||roger at nextmovesoftware dot
com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118608
Roger Sayle changed:
What|Removed |Added
CC||roger at nextmovesoftware dot
com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117012
Roger Sayle changed:
What|Removed |Added
CC||roger at nextmovesoftware dot
com
||roger at nextmovesoftware dot
com
Ever confirmed|0 |1
Last reconfirmed||2024-09-09
--- Comment #2 from Roger Sayle ---
The constant ~0 can be materialized on x86 in only three bytes using either of
the sequences "pu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116275
--- Comment #4 from Roger Sayle ---
Created attachment 58868
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58868&action=edit
proposed patch
Here's my proposed fix (the first of two patches) that resolves the ICE with
the testcase. The p
|1
Assignee|unassigned at gcc dot gnu.org |roger at
nextmovesoftware dot com
Last reconfirmed||2024-08-07
Target Milestone|--- |15.0
--- Comment #1 from Roger Sayle ---
Doh! This is almost certainly caused
|RESOLVED
CC||roger at nextmovesoftware dot
com
Resolution|--- |FIXED
Known to work||15.0
--- Comment #4 from Roger Sayle ---
This should now be fixed/implemented on
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115751
Roger Sayle changed:
What|Removed |Added
Known to work||15.0
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115749
--- Comment #12 from Roger Sayle ---
I owe Kim an apology. It does appear that modern x86_64 processors perform
(many) multiplications faster than the latencies given in the Intel/AMD/Agner
Fog documentation.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115756
Roger Sayle changed:
What|Removed |Added
CC||roger at nextmovesoftware dot
com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115751
--- Comment #4 from Roger Sayle ---
Created attachment 58567
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58567&action=edit
proposed patch
Here's my proposed patch.
|1
Assignee|unassigned at gcc dot gnu.org |roger at
nextmovesoftware dot com
CC||roger at nextmovesoftware dot
com
Status|UNCONFIRMED |ASSIGNED
--- Comment #3 from Roger Sayle ---
Doh
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113673
Roger Sayle changed:
What|Removed |Added
Assignee|roger at nextmovesoftware dot com |unassigned at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109618
Roger Sayle changed:
What|Removed |Added
CC||roger at nextmovesoftware dot
com
|UNCONFIRMED |NEW
CC||roger at nextmovesoftware dot
com
Ever confirmed|0 |1
--- Comment #3 from Roger Sayle ---
Doh! I hadn't noticed (twenty years ago) that -1 was used to represent an
invalid quantity numb
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115489
Roger Sayle changed:
What|Removed |Added
Component|c |tree-optimization
--- Comment #3 from Rog
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115021
Roger Sayle changed:
What|Removed |Added
Summary|[14/15 regression] |[14 regression] unnecessary
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115397
Roger Sayle changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115478
--- Comment #3 from Roger Sayle ---
Hi Jeff, many thanks for looking into this/assigning the PR to yourself.
I'd suggest that the fix is to add a define_code_iterator to aarch64.md
called any_or_plus matching the definition in i386.md.
(define_c
||2024-06-08
Assignee|unassigned at gcc dot gnu.org |roger at
nextmovesoftware dot com
Status|UNCONFIRMED |ASSIGNED
--- Comment #4 from Roger Sayle ---
Created attachment 58386
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58386&
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115351
Roger Sayle changed:
What|Removed |Added
Assignee|roger at nextmovesoftware dot com |unassigned at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115351
Roger Sayle changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |roger at
nextmovesoftware dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115161
Roger Sayle changed:
What|Removed |Added
Ever confirmed|0 |1
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106060
Roger Sayle changed:
What|Removed |Added
Resolution|--- |FIXED
Known to work|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115021
--- Comment #2 from Roger Sayle ---
Here's a reduced test case that should be unaffected by the pending changes to
how V8QI shifts are expanded. Note that the final "t -= t4" is required to
convince the register allocator to "spill".
typedef s
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115021
Roger Sayle changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |roger at
nextmovesoftware dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78947
Roger Sayle changed:
What|Removed |Added
CC||roger at nextmovesoftware dot
com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85559
Bug 85559 depends on bug 78947, which changed state.
Bug 78947 Summary: sub-optimal code for (bool)(int ? int : int)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78947
What|Removed |Added
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113832
--- Comment #5 from Roger Sayle ---
I'm trying to confirm that there are actually widening multiplications in
464.h264ref (on aarch64), but if anyone's already done an analysis of what
might be causing these performance swings, please do post (a
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113673
Roger Sayle changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |roger at
nextmovesoftware dot com
||roger at nextmovesoftware dot
com
Status|NEW |RESOLVED
Target Milestone|--- |14.0
--- Comment #6 from Roger Sayle ---
This is now fixed on mainline (for GCC 14 and GCC 15).
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97756
Roger Sayle changed:
What|Removed |Added
Known to work||14.0
Summary|[11/12/13/14/15 Re
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111701
Roger Sayle changed:
What|Removed |Added
CC||roger at nextmovesoftware dot
com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114767
--- Comment #5 from Roger Sayle ---
Another interesting (simpler) case of -ffast-math pessimization is:
void foo(_Complex double *c)
{
for (int i=0; i<16; i++)
c[i] += __builtin_complex(1.0,0.0);
}
Again without -ffast-math we vectori
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114767
Roger Sayle changed:
What|Removed |Added
CC||roger at nextmovesoftware dot
com
|UNCONFIRMED |NEW
Ever confirmed|0 |1
CC||roger at nextmovesoftware dot
com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114552
Roger Sayle changed:
What|Removed |Added
CC||roger at nextmovesoftware dot
com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114284
Roger Sayle changed:
What|Removed |Added
CC||roger at nextmovesoftware dot
com
at gcc dot gnu.org |roger at
nextmovesoftware dot com
--- Comment #4 from Roger Sayle ---
Created attachment 57587
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57587&action=edit
proposed patch
Proposed fix attached. Currently bootstrapping and regression testing. The
prob
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114187
Roger Sayle changed:
What|Removed |Added
Last reconfirmed||2024-03-01
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113336
Roger Sayle changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106060
Roger Sayle changed:
What|Removed |Added
Target Milestone|--- |15.0
--- Comment #5 from Roger Sayle ---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111267
Roger Sayle changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113690
Roger Sayle changed:
What|Removed |Added
Summary|[13/14 Regression] ICE: in |[13 Regression] ICE: in
|1
CC||roger at nextmovesoftware dot
com
Last reconfirmed||2024-02-15
--- Comment #2 from Roger Sayle ---
The issue appears to be with (poor costing in) loop invariant store motion.
Adding the command line
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113764
Roger Sayle changed:
What|Removed |Added
Summary|[X86] Generates lzcnt when |[X86] __builtin_clz
|bs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113764
--- Comment #2 from Roger Sayle ---
Investigating further, the thinking behind GCC's current behaviour can be found
in Agner Fog's instruction tables; on many architectures BSR is much slower
than LZCNT.
Legacy AMD: BSR=4 cycles, LZCNT=2
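For context on why the choice between the two instructions matters for __builtin_clz: on a non-zero 32-bit input, BSR yields the index of the highest set bit, while LZCNT yields the number of leading zeros, so the two results always sum to 31. The following is a minimal sketch in C, assuming GCC or Clang (for __builtin_clz); the helper names bit_scan_reverse and leading_zeros are illustrative and do not come from the PR.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the highest set bit, i.e. the value BSR computes.
   The result is undefined for x == 0, as with __builtin_clz. */
static inline unsigned bit_scan_reverse(uint32_t x)
{
    assert(x != 0);
    return 31u - (unsigned)__builtin_clz(x);
}

/* Number of leading zero bits, i.e. the value LZCNT computes
   for a non-zero input. */
static inline unsigned leading_zeros(uint32_t x)
{
    assert(x != 0);
    return (unsigned)__builtin_clz(x);
}
```

Because the adjustment is a fixed "31 - n", a compiler can implement either operation with whichever instruction is cheaper on the target, at the cost of one extra arithmetic instruction.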
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113673
--- Comment #4 from Roger Sayle ---
The identified patch implements += the same way as |=. Presumably a version of
the test case replacing "m += *data++;" with "m |= *data++;" would be more
useful at identifying a patch that actually changed EH
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113832
Roger Sayle changed:
What|Removed |Added
CC||roger at nextmovesoftware dot
com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113764
Roger Sayle changed:
What|Removed |Added
CC||roger at nextmovesoftware dot
com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113759
--- Comment #9 from Roger Sayle ---
Many thanks Jakub. Sorry again for the inconvenience.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113720
Roger Sayle changed:
What|Removed |Added
CC||roger at nextmovesoftware dot
com
at gcc dot gnu.org |roger at
nextmovesoftware dot com
--- Comment #4 from Roger Sayle ---
I'm bootstrapping and regression testing a fix.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113701
Roger Sayle changed:
What|Removed |Added
See Also||https://gcc.gnu.org/bugzill
at gcc dot gnu.org |roger at
nextmovesoftware dot com
Target Milestone|--- |14.0
--- Comment #7 from Roger Sayle ---
A revised patch has been posted for review/approval to gcc-patches:
https://gcc.gnu.org/pipermail/gcc-patches/2024-January/644147.html
at gcc dot gnu.org |roger at
nextmovesoftware dot com
--- Comment #7 from Roger Sayle ---
I'm bootstrapping and regression testing a patch.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113533
--- Comment #14 from Roger Sayle ---
My apologies for not keeping folks updated on my thinking. Following Oleg's
feedback, I've decided to slim down my proposed fix to the bare minimum, and
postpone the other rtx_costs improvements until GCC 15
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113533
Roger Sayle changed:
What|Removed |Added
See Also||https://gcc.gnu.org/bugzill
nextmovesoftware dot com |unassigned at gcc dot
gnu.org
Summary|libatomic (testsuite) |libatomic (testsuite)
|regressions on |regressions on arm
|armv6-linux-gnueabihf |
--- Comment #4 from Roger Sayle ---
Hi Victor,
Yes, I agree
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113560
--- Comment #6 from Roger Sayle ---
In the .optimized dump, we have:
__int128 unsigned __res;
__int128 unsigned _12;
...
__res_11 = in_2(D) w* 184467440738;
_12 = __res_11 & 18446744073709551615;
__res_7 = _12 * 100;
So the first mu
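The three statements in the dump can be read as roughly the following C. This is only a sketch: it assumes that "in" is a 64-bit unsigned value, that "w*" denotes a 64x64->128-bit widening multiply, and that the compiler provides unsigned __int128 (GCC and Clang on 64-bit targets). The function name scaled_low_part is hypothetical.

```c
#include <stdint.h>

typedef unsigned __int128 u128;

/* Hypothetical C equivalent of the three GIMPLE statements above. */
static inline u128 scaled_low_part(uint64_t in)
{
    /* __res_11 = in w* 184467440738: widening 64x64->128 multiply. */
    u128 res = (u128)in * 184467440738ULL;

    /* _12 = __res_11 & 18446744073709551615: keep the low 64 bits. */
    u128 low = res & UINT64_MAX;

    /* __res_7 = _12 * 100: multiply the masked value by 100. */
    return low * 100;
}
```

After the mask, the upper 64 bits of the second operand are known to be zero, so the second multiplication could also be done as a widening 64x64->128 multiply.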
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113560
Roger Sayle changed:
What|Removed |Added
CC||roger at nextmovesoftware dot
com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113533
--- Comment #10 from Roger Sayle ---
Hi Oleg. Great question. The "speed" parameter passed to rtx_costs and
address_cost indicates whether the middle-end is optimizing for performance, and
interested in the number of cycles taken by each inst
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113533
--- Comment #8 from Roger Sayle ---
Created attachment 57190
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57190&action=edit
proposed patch
Proposed patch to provide a sane/saner set of rtx_costs for SH. There's plenty
more that could b
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: roger at nextmovesoftware dot com
Target Milestone: ---
This patch is a placeholder for tracking the reported failures of
FAIL: gcc.target/arm/bics_3.c scan-assembler-times
|UNCONFIRMED |NEW
CC||roger at nextmovesoftware dot
com
Ever confirmed|0 |1
--- Comment #6 from Roger Sayle ---
To help diagnose the problem, I came up with this simple patch:
diff --git a/gcc/fwprop.cc b/gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91681
Roger Sayle changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111267
--- Comment #10 from Roger Sayle ---
A revised and improved patch has been posted for review at
https://gcc.gnu.org/pipermail/gcc-patches/2024-January/643062.html
|UNCONFIRMED |ASSIGNED
Ever confirmed|0 |1
Assignee|unassigned at gcc dot gnu.org |roger at
nextmovesoftware dot com
--- Comment #1 from Roger Sayle ---
As there's a patch for this regression (awaiting review), I should upgrade the
PR stat
at gcc dot gnu.org |roger at
nextmovesoftware dot com
--- Comment #8 from Roger Sayle ---
Now we're in stage4, I'll take this. I'm bootstrapping and regression testing
a variant of my tweak to try_fwprop_subst_pattern. The change in comment #7
can hang loop inde
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106060
Roger Sayle changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |roger at
nextmovesoftware dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112992
Roger Sayle changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111267
--- Comment #7 from Roger Sayle ---
Very many thanks to Jeff Law for pointing me to fwprop. The following simple
patch also fixes this regression.
diff --git a/gcc/fwprop.cc b/gcc/fwprop.cc
index 0c588f8..cbba44e 100644
--- a/gcc/fwprop.cc
+++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111267
--- Comment #6 from Roger Sayle ---
Sorry for the delay in replying/answering Jakub's questions/comments. Yes,
using a define_insn_and_split in the backend fixes/works around the issue (and
I agree your implementation/refinement in comment #5 i
Priority: P3
Component: other
Assignee: unassigned at gcc dot gnu.org
Reporter: roger at nextmovesoftware dot com
Target Milestone: ---
As suggested by Richard Earnshaw, this opens a bugzilla PR for tracking this
issue. All the tests in libatomic currently fail on a
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113231
Roger Sayle changed:
What|Removed |Added
Resolution|--- |FIXED
Target Milestone|---
at gcc dot gnu.org |roger at
nextmovesoftware dot com
--- Comment #4 from Roger Sayle ---
I'm testing a patch, for more accurate conversion gains/costs in the
scalar-to-vector pass. Adding -mno-stv will work around the problem.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104914
--- Comment #19 from Roger Sayle ---
Created attachment 56930
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56930&action=edit
proposed patch
And now for a patch that does (or should) work. This even contains an
optimization, we middle-e
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104914
--- Comment #18 from Roger Sayle ---
Please ignore comment #17, the above patch is completely bogus/broken.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104914
--- Comment #17 from Roger Sayle ---
I think this patch might resolve the problem (or move it somewhere else):
diff --git a/gcc/expr.cc b/gcc/expr.cc
index 9fef2bf6585..218bca905f5 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -6274,10 +6274,7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112380
Roger Sayle changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: roger at nextmovesoftware dot com
Target Milestone: ---
The following four functions should in theory all produce the same code:
typedef unsigned long long v4di __attribute
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112380
--- Comment #12 from Roger Sayle ---
Patch proposed (actually two alternatives proposed) at
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636203.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110551
Roger Sayle changed:
What|Removed |Added
Target Milestone|11.5|14.0
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91865
Roger Sayle changed:
What|Removed |Added
CC||roger at nextmovesoftware dot
com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112380
Roger Sayle changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |roger at
nextmovesoftware dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50755
Roger Sayle changed:
What|Removed |Added
CC||roger at nextmovesoftware dot
com
||roger at nextmovesoftware dot
com
Ever confirmed|0 |1
Last reconfirmed||2023-10-30
|NEW
CC||roger at nextmovesoftware dot
com
Last reconfirmed||2023-10-26
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111267
--- Comment #3 from Roger Sayle ---
This patch addresses the regression, but probably isn't the correct fix.
The issue is that the backend now has a way of representing the concatenation
of two registers (for example, TI is constructed for two
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111267
--- Comment #2 from Roger Sayle ---
Created attachment 56162
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56162&action=edit
proof-of-concept patch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110551
Roger Sayle changed:
What|Removed |Added
CC||roger at nextmovesoftware dot
com
|
CC||roger at nextmovesoftware dot
com
Build|powerpc64-linux-gnu |
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110701
Roger Sayle changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=17886
Roger Sayle changed:
What|Removed |Added
CC||roger at nextmovesoftware dot
com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111519
--- Comment #2 from Roger Sayle ---
Complicated. Things have gone wrong before the strlen pass which is given:
_73 = e;
_72 = *_73;
...
*_73 = prephitmp_23;
d = _72;
Here the assignment to *_73 overwrites the value of f (at *e) which
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71749
Roger Sayle changed:
What|Removed |Added
CC||roger at nextmovesoftware dot
com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91251
Roger Sayle changed:
What|Removed |Added
CC||roger at nextmovesoftware dot
com
|--- |8.4
CC||roger at nextmovesoftware dot
com
Status|UNCONFIRMED |RESOLVED
--- Comment #6 from Roger Sayle ---
As reported by Giulio, this bug has now been fixed.