https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118328
--- Comment #22 from Diego Russo ---
Another reason to have this implemented is the CPython JIT. It is a template
(stencil) JIT where every micro OP is precompiled as stencil. At run time these
stencils will be stitched together and patched with
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118328
--- Comment #21 from Ken Jin ---
I sincerely apologize for my previous performance figures. The baseline was
worse due to a Clang-19 bug https://github.com/llvm/llvm-project/issues/106846.
So the numbers were inaccurate.
On Clang-20, on the pys
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118328
--- Comment #20 from Ken Jin ---
(In reply to Andrew Pinski from comment #17)
> I am not sure if I understand this correctly.
> Can you make a simple table:
>
> w/o tail-call - 1
> with tail-call but not preserve_none -
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118328
--- Comment #19 from Diego Russo ---
> Can you make a simple table:
w/o tail-call - 1
with tail-call but not preserve_none - 0.94
with tail-call and preserve_none - 1
You understood correctly.
I think there is st
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118328
Sam James changed:
What|Removed |Added
CC||fw at gcc dot gnu.org,
|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118328
--- Comment #17 from Andrew Pinski ---
> Can we have the same implementation/interface of LLVM?
Is there real documentation on this attribute or is it just ad hoc on what it
does on the LLVM side about the ABI implications? It seems to me there
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118328
--- Comment #16 from Diego Russo ---
Right, I had a couple of problems with running the benchmarks. A few failures
and the wrong environment variable to select the binary of the compiler.
Anyway I re-ran the benchmarks and the binary without pr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118328
Sam James changed:
What|Removed |Added
Last reconfirmed||2025-02-07
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118328
--- Comment #15 from Diego Russo ---
Folks, I think I've botched the performance measurement. Need to retake the
measurement. Give me some time and I'll come back with the right results.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118328
--- Comment #14 from Richard Sandiford ---
(In reply to Sam James from comment #13)
> The request here notwithstanding, bug report(s) with testcases for missed
> opportunities in ipa-ra would be welcome too.
Agreed, if we find any. But just in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118328
--- Comment #13 from Sam James ---
The request here notwithstanding, bug report(s) with testcases for missed
opportunities in ipa-ra would be welcome too.
(btw, x86 has no_callee_saved_registers / no_caller_saved_registers too.)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118328
Diego Russo changed:
What|Removed |Added
CC||Diego.Russo at arm dot com
--- Comment #1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118328
--- Comment #11 from Richard Sandiford ---
Created attachment 60175
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60175&action=edit
Proof-of-concept patch
Here's a lightly-tested proof-of-concept patch for preserve_none on AArch64.
In p
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118328
--- Comment #10 from Andrew Pinski ---
(In reply to Ken Jin from comment #7)
> The files are too big to upload here, so I've uploaded them to
> https://github.com/Fidget-Spinner/debugging-dump. They correspond to the
> main interpreter loop of C
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118328
--- Comment #9 from Andrew Pinski ---
(In reply to Ken Jin from comment #7)
> Specifically, zoom in on the function _TAIL_CALL_YIELD_VALUE, it produces on
> GCC 15 (note the assembly here might be slightly different than the one in
> .s file, be
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118328
--- Comment #8 from Andrew Pinski ---
(In reply to Ken Jin from comment #7)
> The files are too big to upload here, so I've uploaded them to
> https://github.com/Fidget-Spinner/debugging-dump. They correspond to the
> main interpreter loop of CP
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118328
--- Comment #7 from Ken Jin ---
The files are too big to upload here, so I've uploaded them to
https://github.com/Fidget-Spinner/debugging-dump. They correspond to the main
interpreter loop of CPython
https://github.com/python/cpython/blob/e1988
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118328
--- Comment #6 from Andrew Pinski ---
(In reply to Ken Jin from comment #5)
> However, it seems to me that there's still extraneous push and pops for
> function prologue/epilogue that could be removed with preserve_none. GCC's
> regalloc is defi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118328
--- Comment #5 from Ken Jin ---
However, it seems to me that there's still extraneous push and pops for
function prologue/epilogue that could be removed with preserve_none. GCC's
regalloc is definitely a lot better than Clang when both don't hav
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118328
--- Comment #4 from Ken Jin ---
I can confirm that in the case of tail calls, GCC does produce
better/equivalent register spilling code than clang 19.1.0, by manual
inspection of call sites.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118328
Ken Jin changed:
What|Removed |Added
CC||kenjin4096 at gmail dot com
--- Comment #3 fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118328
--- Comment #2 from Richard Sandiford ---
(In reply to Andrew Pinski from comment #1)
> Note most of the use cases in my view for these attributes. These attributes
> are there specifically to work around the fact that llvm does not do ipa ra
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118328
--- Comment #1 from Andrew Pinski ---
Note most of the use cases in my view for these attributes. These attributes
are there specifically to work around the fact that llvm does not do ipa ra and
the compiler does not record which registers are a