https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119727
--- Comment #4 from Andi Kleen ---
Yes but on the OS where you know it it's better to do both to make the runs
more reproducible. There are also bugs that don't reproduce on ASLR
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119727
Andi Kleen changed:
What|Removed |Added
CC||andi-gcc at firstfloor dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106767
Andi Kleen changed:
What|Removed |Added
CC||andi-gcc at firstfloor dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119719
Andi Kleen changed:
What|Removed |Added
CC||andi-gcc at firstfloor dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119705
Andi Kleen changed:
What|Removed |Added
CC||andi-gcc at firstfloor dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119482
Andi Kleen changed:
What|Removed |Added
CC||andi-gcc at firstfloor dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114563
--- Comment #10 from Andi Kleen ---
It doesn't really help for the PR119387 test case, perhaps not surprising
because it optimizes freeing not allocation:
Summary
./gcc/cc1plus-opt -w -std=c++20 ~/gcc/git/tsrc/119387-formatted.ii -quiet
ran
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114563
Andi Kleen changed:
What|Removed |Added
Attachment #60907|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64500
Andi Kleen changed:
What|Removed |Added
CC||andi-gcc at firstfloor dot org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114563
--- Comment #6 from Andi Kleen ---
Created attachment 60907
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60907&action=edit
patch for multiple free lists in ggc-page
I saw it in some profile, but later trying didn't help anymore.
Needs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107048
Andi Kleen changed:
What|Removed |Added
CC||andi-gcc at firstfloor dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119376
--- Comment #18 from Andi Kleen ---
Yes the multiple passes are a problem. They also do redundant work I believe.
But it would be easier to just check opt_tailcalls I think instead of adding a
new variable.
>Plus, given that tail_calls pass use
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119375
--- Comment #4 from Andi Kleen ---
I applied Alex's patches, but had to resolve one conflict. Unfortunately it made
no difference in the test results.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119375
--- Comment #2 from Andi Kleen ---
I already had that commit in my tree, since it was already on trunk.
Or did you think of something different?
commit cfdb961588ba318a78e995d2e2cde43130acd993
Author: Alex Coplan
Date: Tue Nov 26 17:48:14 20
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119376
--- Comment #6 from Andi Kleen ---
For the gcc 15 release we could just drop the clang:: support, so it becomes
opt-in? (have to use gnu::musttail)
Assignee: unassigned at gcc dot gnu.org
Reporter: andi-gcc at firstfloor dot org
CC: erozen at microsoft dot com
Target Milestone: ---
Host: x86_64-linux
This can only be seen on x86_64 hosts which have the autofdo tooling installed
(which is not most
: gcov-profile
Assignee: unassigned at gcc dot gnu.org
Reporter: andi-gcc at firstfloor dot org
Target Milestone: ---
According to
https://gcc.opensuse.org/gcc-lcov/gcc/mcf.cc.gcov.html
mcf.cc is not executed at all in standard test suite runs. This is called
through
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116545
--- Comment #5 from Andi Kleen ---
Something like this untested patch would likely also fix the test case:
diff --git a/gcc/c-family/c-lex.cc b/gcc/c-family/c-lex.cc
index e450c9a57f0..e1f78431210 100644
--- a/gcc/c-family/c-lex.cc
+++ b/gcc/c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116545
--- Comment #2 from Andi Kleen ---
It's too late to fix gcc 15, you'll just have to release an update. Sorry.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116087
--- Comment #4 from Andi Kleen ---
After some digging into the code: libcpp already keeps track of how many tokens
get expanded in a global. This is even accessible through linemap's
statistics dumped on -fmem-report, but only as an averaged
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118288
Andi Kleen changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118168
Andi Kleen changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118168
--- Comment #17 from Andi Kleen ---
With the patches now in trunk the overhead for enabling
-Wmisleading-indentation is now ~32% unless --param=file-cache-lines=1 is
used. With the drop behind cache it would be noise.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118657
--- Comment #9 from Andi Kleen ---
With constexpr you are guaranteed a visible initializer. const would
potentially require messing with IPA and might be impossible.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118657
--- Comment #6 from Andi Kleen ---
The C test case needs to use constexpr too.
#define DATA_SIZE 1024
static constexpr int TO_DATA_INDEX[DATA_SIZE] = {};
bool foo(int* data, unsigned char first_idx)
{
int second_idx = TO_DATA_INDEX[first_idx]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118657
Andi Kleen changed:
What|Removed |Added
CC||andi-gcc at firstfloor dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116080
--- Comment #26 from Andi Kleen ---
It's quite difficult to express the rules for tail calls for all targets in
dejagnu effective rules.
Currently we're a bit too aggressive for excluding on Power, but nobody has the
energy to reopen it again,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107827
--- Comment #7 from Andi Kleen ---
We need a validator for x86 assembler length indications in the x86 machine
descriptions, then it could be easily enabled.
This will require patching gas at least for test suite runs.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118442
--- Comment #4 from Andi Kleen ---
The problem seems to be that the call BB has an extra fallthrough edge to the
basic block containing the return. Perhaps it should just have an EXIT edge or
not split the RETURN? (not sure if that is legal in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118442
Andi Kleen changed:
What|Removed |Added
CC||andi-gcc at firstfloor dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118469
Andi Kleen changed:
What|Removed |Added
CC||andi-gcc at firstfloor dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118468
Andi Kleen changed:
What|Removed |Added
Summary|vectorizer: extra phi |vectorizer: if conversion
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95834
Andi Kleen changed:
What|Removed |Added
CC||andi-gcc at firstfloor dot org
--- Comment
: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: andi-gcc at firstfloor dot org
Target Milestone: ---
This is forked from PR116126 to handle another early exit problem
const unsigned char *search_line_fast2 (const unsigned char *s, const unsigned
char
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118373
--- Comment #13 from Andi Kleen ---
If it immediately reboots (and you didn't use panic=XXX to reboot quickly) then
it might be a triple fault. These are unfortunately harder to debug because
they don't produce any console output in a native set
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118373
--- Comment #8 from Andi Kleen ---
The classic "take a photo of the panic screen" method should be enough.
The critical part is the Code: line that shows the bad instruction, and the
name of the function.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106883
Andi Kleen changed:
What|Removed |Added
CC||andi-gcc at firstfloor dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118289
--- Comment #2 from Andi Kleen ---
Yes sorry for the dup.
: target
Assignee: unassigned at gcc dot gnu.org
Reporter: andi-gcc at firstfloor dot org
Target Milestone: ---
Target: x86_64
(I suspect it will affect any non aarch64 target)
./gcc/cc1 -I ../gcc/gcc/ginclude/
../gcc/gcc/testsuite/gcc.target/aarch64/crc-builtin
-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: andi-gcc at firstfloor dot org
Target Milestone: ---
Target: x86_64
./gcc/cc1 -I ../gcc/gcc/ginclude/
../gcc/gcc/testsuite/gcc.target/aarch64/crc-builtin-pmul64.c
crc8_data8 crc16_data8 crc16_data16
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118198
Andi Kleen changed:
What|Removed |Added
CC||andi-gcc at firstfloor dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118250
--- Comment #2 from Andi Kleen ---
With
--param=switch-lower-slow-alg-max-cases=1
(so using greedy) trunk includes "38" in the first bit cluster, but the LLVM
code is still better. I've seen the dynamic programming algorithm miss clusters
like
: target
Assignee: unassigned at gcc dot gnu.org
Reporter: andi-gcc at firstfloor dot org
Target Milestone: ---
Inspired by https://github.com/komrad36/CRC
When generating a jump table for switch gcc always uses .long for PIC or .quad
for non PIC. This both wastes code size and
Assignee: unassigned at gcc dot gnu.org
Reporter: andi-gcc at firstfloor dot org
Target Milestone: ---
Inspired by https://github.com/komrad36/CRC
Even though gcc has CRC pattern matching now which should be implemented on x86
too, it would be still good if it handled the manual coded
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118168
--- Comment #11 from Andi Kleen ---
Fix posted here:
https://inbox.sourceware.org/gcc-patches/20241227024559.2224623-1-a...@firstfloor.org/T/#t
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: andi-gcc at firstfloor dot org
Target Milestone: ---
This is the hot loop of the line searching function in gcc input.cc.
It currently fails to vectorize on AVX512F. Would be nice if it could.
% gcc -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113149
Andi Kleen changed:
What|Removed |Added
CC||andi-gcc at firstfloor dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118050
Andi Kleen changed:
What|Removed |Added
CC||andi-gcc at firstfloor dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118168
--- Comment #10 from Andi Kleen ---
My earlier analysis was wrong. The file cache is exactly supposed to avoid
this quadratic case.
But the cache only works if the linemap knows the total number of lines,
otherwise it uses a much slower fallba
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118168
--- Comment #9 from Andi Kleen ---
Created attachment 59954
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59954&action=edit
add tunables for file cache
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118168
--- Comment #8 from Andi Kleen ---
Oh actually it's not the beginning, but some point file size / 100 (the scaled
down line cache)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118168
--- Comment #7 from Andi Kleen ---
Actually in my case where I interrupted and the difference was 60k I think the
problem was that the lexer offset was beyond the 100 lines where the position
is cached, and when that happens the file_cache just
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118168
--- Comment #6 from Andi Kleen ---
So the file cache has a window of 100 lines:
static const size_t line_record_size = 100;
The indentation code rereads the line of the guard, body, next statement and
that is all cached if it's all within 100
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118168
--- Comment #3 from Andi Kleen ---
Never mind, I had an old compiler.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118168
--- Comment #1 from Andi Kleen ---
Did you attach the correct file? I get
mypy.c:9524:5: error: implicit declaration of function
‘__builtin_c23_va_start’; did you mean ‘__builtin_ms_va_start’?
[-Wimplicit-function-declaration]
9524 | __bu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117091
--- Comment #9 from Andi Kleen ---
Yes I guess we should keep better switches at -O1 because machine generated
code may have a lot of switches.
I don't think we need perfect clustering? Perhaps there is some heuristic that
is good enough. Maybe j
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117091
Andi Kleen changed:
What|Removed |Added
CC||andi-gcc at firstfloor dot org
Priority: P3
Component: other
Assignee: unassigned at gcc dot gnu.org
Reporter: andi-gcc at firstfloor dot org
Target Milestone: ---
This is with tramp3d, but I suspect it will happen on other files too.
% ./gcc/cc1plus -ftime-report -fdiagnostics-format=sarif-file
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116520
--- Comment #7 from Andi Kleen ---
Tamas also gave this example in PR115866 which shows the same problem:
short a[100];
int foo(int n, int counter)
{
for (int i = 0; i < n; i++)
{
if (a[i] == 1 || a[i] == 2 || a[i] == 7 || a[i]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115866
--- Comment #8 from Andi Kleen ---
It doesn't even try to convert the switch because of
t.c.179.ifcvt:
Can not ifcvt due to multiple exits
if (loop->num_nodes > 2)
{
/* More than one loop exit is too much to handle. */
if (
: target
Assignee: unassigned at gcc dot gnu.org
Reporter: andi-gcc at firstfloor dot org
Target Milestone: ---
On x86_64-linux:
volatile int a;
void f1(void)
{
a++;
}
int b;
void f2(void)
{
b++;
}
generates
f1:
movl a(%rip), %eax
addl $1
Assignee: unassigned at gcc dot gnu.org
Reporter: andi-gcc at firstfloor dot org
Target Milestone: ---
Forked from PR83324. Applies to C/C++
It seems clang supports old style __attribute__ label attributes for musttail
(and presumably others) while gcc only supports the standard
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116520
Andi Kleen changed:
What|Removed |Added
Summary|incorrect |Multiple condition lead to
Assignee: unassigned at gcc dot gnu.org
Reporter: andi-gcc at firstfloor dot org
Target Milestone: ---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116500
--- Comment #4 from Andi Kleen ---
It seems sparc doesn't support comparisons in vectorization?
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/vect-switch-ifcvt-1.c:13:7:
missed: not vectorized: relevant stmt not supported: _13 = _1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116500
--- Comment #2 from Andi Kleen ---
Do you have the dump file from tree-vect?
I guess it just doesn't vectorize something here.
The right fix is probably to skip it for sparc, or adjust the vect_int target
test.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116497
Andi Kleen changed:
What|Removed |Added
Resolution|--- |INVALID
Status|WAITING
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116497
--- Comment #21 from Andi Kleen ---
As HJ pointed out the change is not needed, the compiler DTRT with
no_callee_saved_registers on the callees.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116497
--- Comment #16 from Andi Kleen ---
Created attachment 59013
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59013&action=edit
test case
This test case using Pinski's clobber trick shows the benefit.
If you compile with -O2 -mgeneral-regs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116497
--- Comment #1 from Andi Kleen ---
Disable check for no_caller_saved_registers enforcing non FP.
diff --git a/gcc/config/i386/i386-options.cc b/gcc/config/i386/i386-options.cc
index f79257cc764..cec652cc9e6 100644
--- a/gcc/config/i386/i386-opt
: target
Assignee: unassigned at gcc dot gnu.org
Reporter: andi-gcc at firstfloor dot org
CC: hjl.tools at gmail dot com
Target Milestone: ---
Target: x86_64-linux
When writing threaded code interpreters by chaining functions with musttail the
normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116285
Andi Kleen changed:
What|Removed |Added
CC||andi-gcc at firstfloor dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71672
Andi Kleen changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
Assignee: unassigned at gcc dot gnu.org
Reporter: andi-gcc at firstfloor dot org
Target Milestone: ---
Target: x86_64-linux
unsigned fclear(unsigned a, unsigned b)
{
if (a & (1 << 10))
b &= ~(1 << 20);
return b;
}
gives
c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115866
Andi Kleen changed:
What|Removed |Added
Attachment #58804|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116166
--- Comment #13 from Andi Kleen ---
Created attachment 58842
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58842&action=edit
add a param to limit BBs for dominator pass
Maybe something like this patch. It adds a check to disable the dom
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116191
Andi Kleen changed:
What|Removed |Added
CC||andi-gcc at firstfloor dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115866
Andi Kleen changed:
What|Removed |Added
CC||andi-gcc at firstfloor dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116166
Andi Kleen changed:
What|Removed |Added
CC||andi-gcc at firstfloor dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116163
Andi Kleen changed:
What|Removed |Added
CC||andi-gcc at firstfloor dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116080
Andi Kleen changed:
What|Removed |Added
Attachment #58761|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116080
--- Comment #8 from Andi Kleen ---
Patch was reverted, it just made a bunch of tests unsupported.
problems:
- Need unique name for each new test to not confuse the caching
- -O0 tests need to use musttail explicitly because the musttail pass
onl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116019
Andi Kleen changed:
What|Removed |Added
Resolution|--- |FIXED
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116019
Andi Kleen changed:
What|Removed |Added
Status|RESOLVED|UNCONFIRMED
Resolution|FIXED
-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: andi-gcc at firstfloor dot org
Target Milestone: ---
This is somewhat of a metabug to track vectorization of libcpp/lex.c
search_line_fast, which currently has manual vectorization for various
architectures. It would be
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116019
Andi Kleen changed:
What|Removed |Added
Resolution|--- |FIXED
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116080
--- Comment #6 from Andi Kleen ---
Created attachment 58761
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58761&action=edit
Improve test suite tail call checks
This patch should fix it. We must run the test suite tail call probes without
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: andi-gcc at firstfloor dot org
Target Milestone: ---
The Linux kernel hit an interesting problem where an overly complicated recursive
macro expansion caused significant compile-time slowdowns.
https://lore.kernel.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116080
--- Comment #2 from Andi Kleen ---
Also can you upload the whole log files somewhere? I would like to see what the
output of check_effective_target_struct_tail_call is. It should have caught
some of these problems.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116080
--- Comment #1 from Andi Kleen ---
Yes it is known that powerpc (or some flavors of it) has poor tail call support
due to ABI limitations.
Just need to figure out how to skip the test. I guess it needs a better test in
check_effective_target_ta
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116014
Andi Kleen changed:
What|Removed |Added
CC||andi-gcc at firstfloor dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116047
Andi Kleen changed:
What|Removed |Added
CC||andi-gcc at firstfloor dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83355
Andi Kleen changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
++
Assignee: unassigned at gcc dot gnu.org
Reporter: andi-gcc at firstfloor dot org
Target Milestone: ---
(bug requires musttail patchkit committed)
This is related to PR115606
On targets like ARM where the C++ frontend prevents tail calls returning
structures we also get
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83324
--- Comment #19 from Andi Kleen ---
Middle/back-end parts are in, still need acks for the C/C++ frontend parts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115979
--- Comment #3 from Andi Kleen ---
Doing it in the frontend would require some duplication between C/C++ at least?
I was thinking to just keep searching if has_mustail is set, but was wary of
endless loops walking single basic block predecessors.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115255
Andi Kleen changed:
What|Removed |Added
Status|WAITING |RESOLVED
Resolution|---
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: andi-gcc at firstfloor dot org
Target Milestone: ---
(this bug requires committing the remaining pieces of musttail)
When running gcc/testsuite/g++.dg/musttail11.C with -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115813
--- Comment #2 from Andi Kleen ---
Is that the right pattern for the example? It looks different
Enabling match.pd debugging for the scalar version shows:
taddbit.c.034t.ccp1:Applying pattern match.pd:3960, gimple-match.cc:18437
taddbit.c.034t
-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: andi-gcc at firstfloor dot org
Target Milestone: ---
typedef int v4si __attribute__((vector_size(16)));
v4si v(v4si x)
{
x = (x << 1) | 1;
x = (x << 1) | 1;
return