On Mon, 7 Nov 2022, Alexander Monakov wrote:
>
> On Tue, 1 Nov 2022, Alexander Monakov wrote:
>
> > Hi,
> >
> > I'm sending followup fixes for combinatorial explosion of znver scheduling
> > automaton tables as described in the earlier thread:
> >
> > https://inbox.sourceware.org/gcc-patches
On Mon, 14 Nov 2022, Joshi, Tejas Sanjay wrote:
> [Public]
>
> Hi,
Hi. I'm still waiting for feedback on fixes for existing models:
https://inbox.sourceware.org/gcc-patches/5ae6fc21-edc6-133-aee2-a41e16eb...@ispras.ru/T/#t
did you have a chance to look at those?
> PFA the patch which adds znv
On Tue, 15 Nov 2022, Joshi, Tejas Sanjay wrote:
> > > +;; AVX instructions
> > > +(define_insn_reservation "znver4_sse_log" 1
> > > + (and (eq_attr "cpu" "znver4")
> > > + (and (eq_attr "type" "sselog,sselog1")
> > > +
On Tue, 15 Nov 2022, Jonathan Wakely via Gcc-patches wrote:
> > @item -mrelax-cmpxchg-loop
> > @opindex mrelax-cmpxchg-loop
> >-Relax cmpxchg loop by emitting an early load and compare before cmpxchg,
> >-execute pause if load value is not expected. This reduces excessive
> >-cachline bouncing whe
On Tue, 15 Nov 2022, Jonathan Wakely wrote:
> > How about the following:
> >
> > When emitting a compare-and-swap loop for @ref{__sync Builtins}
> > and @ref{__atomic Builtins} lacking a native instruction, optimize
> > for the highly contended case by issuing an atomic load before the
> > @code
On Wed, 16 Nov 2022, Hongyu Wang wrote:
> > When emitting a compare-and-swap loop for @ref{__sync Builtins}
> > and @ref{__atomic Builtins} lacking a native instruction, optimize
> > for the highly contended case by issuing an atomic load before the
> > @code{CMPXCHG} instruction, and using the
On Wed, 16 Nov 2022, Jan Hubička wrote:
> This looks really promising. I will experiment with the patch for separate
> znver3 model, but I think we should be able to keep
> them unified and hopefully get both less code duplication and table sizes.
Do you mean separate znver4 (not '3') model (i
When instrumentation is requested via -fsanitize-coverage=trace-pc, GCC
emits calls to the __sanitizer_cov_trace_pc callback in each basic block.
This callback is supposed to be implemented by the user, and should be
able to identify the containing basic block by inspecting its return
address. Tailca
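
As a rough illustration of the callback contract described above, here is a minimal sketch of a user-provided callback. The variables last_pc and hits are hypothetical names introduced only for this example, and __builtin_return_address is the GCC builtin assumed to be available. The file defining the callback would normally be compiled without -fsanitize-coverage=trace-pc so that the callback is not itself instrumented.

```c
#include <stdint.h>

/* Hypothetical bookkeeping for this sketch only: the most recent call
   site and the number of callback invocations.  */
static uintptr_t last_pc;
static unsigned long hits;

/* Called by instrumented code once per basic block when building with
   -fsanitize-coverage=trace-pc.  */
void
__sanitizer_cov_trace_pc (void)
{
  /* The return address points back into the instrumented basic block,
     which is how the callback identifies its caller.  */
  last_pc = (uintptr_t) __builtin_return_address (0);
  hits++;
}
```

If the call to the callback were emitted as a tail call (a jump), the return address seen here would belong to a different frame than the instrumented block, so the identification would no longer work.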
Clean up confusing changes from the recent refactoring for
parallel match.pd build.
gimple-match-head.o is not built. Remove related flags adjustment.
Autogenerated gimple-match-N.o files do not depend on
gimple-match-exports.cc.
{gimple,generic}-match-auto.h only depend on the prerequisites of
On Fri, 5 May 2023, Tamar Christina wrote:
> > > Am 05.05.2023 um 19:03 schrieb Alexander Monakov via Gcc-patches > patc...@gcc.gnu.org>:
> > >
> > > Clean up confusing changes from the recent refactoring for parallel
> > > match.pd build.
> >
clean up match.pd-related dependencies
> >
> >
> > On Fri, 5 May 2023, Tamar Christina wrote:
> >
> > > > > Am 05.05.2023 um 19:03 schrieb Alexander Monakov via Gcc-patches
> > > > > > > > patc...@gcc.gnu.org>:
> > &
On Fri, 5 May 2023, Alexander Monakov wrote:
> > > gimple-head-export.cc does not exist.
> > >
> > > gimple-match-exports.cc is not a generated file. It's under source
> > > control and
> > > edited independently from genmatch.cc. It is compiled separately,
> > > producing
> > > gimple-match-ex
I'm trying to study match.pd/genmatch with the eventual goal of
improving match-and-simplify code generation. Here's some trivial
cleanups for the recent refactoring in the meantime.
Alexander Monakov (3):
genmatch: clean up emit_func
genmatch: clean up showUsage
genmatch: fixup get_out_file
Display usage more consistently and get rid of camelCase.
gcc/ChangeLog:
* genmatch.cc (showUsage): Reimplement as ...
(usage): ...this. Adjust all uses.
(main): Print usage when no arguments. Add missing 'return 1'.
---
gcc/genmatch.cc | 21 ++---
1 fil
get_out_file did not follow the coding conventions (mixing three-space
and two-space indentation, missing linebreak before function name).
Take that as an excuse to reimplement it in a more terse manner and
rename as 'choose_output', which is hopefully more descriptive.
gcc/ChangeLog:
*
Eliminate boolean parameters of emit_func. The first ('open') just
prints 'extern' to generated header, which is unnecessary. Introduce a
separate function to use when finishing a declaration in place of the
second ('close').
Rename emit_func to 'fp_decl' (matching 'fprintf' in length) to unbreak
On Wed, 25 May 2022, liuhongt via Gcc-patches wrote:
> Right now, mem_cost for separate mem alternative is 1 * frequency which
> is pretty small and caused the unnecessary SSE spill in the PR, I've tried
> to rework backend cost model, but RA is still not happy with that (regress
> somewhere else). I t
> > In the PR, the spill happens in the initial basic block of the function,
> > i.e.
> > the one with the highest frequency.
> >
> > Also as noted in the PR, swapping the 'unlikely' branch to 'likely' avoids
> > the spill,
> > even though it does not affect the frequency of the initial basic bl
On Mon, 30 May 2022, Hongtao Liu wrote:
> On Mon, May 30, 2022 at 2:22 PM Alexander Monakov via Gcc-patches
> wrote:
> > >
> > > The spill is mainly decided by 3 insns related to r92
> > >
> > > 283(insn 3 61 4 2 (set (reg/v:SF 92 [ x ])
> &
On Mon, 30 May 2022, Hongtao Liu wrote:
> On Mon, May 30, 2022 at 3:44 PM Alexander Monakov wrote:
> >
> > On Mon, 30 May 2022, Hongtao Liu wrote:
> >
> > > On Mon, May 30, 2022 at 2:22 PM Alexander Monakov via Gcc-patches
> > > wrote:
> > > >
On Thu, 16 Jun 2022, Martin Liška wrote:
> Hi.
>
> I'm sending an updated version of the patch where I addressed the comments.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?
I noticed a typo (no objection on the substance of the patch from me
Hello Tom,
Thank you for the investigation and the detailed writeup. It was difficult for
me to infer the internal API contracts here (and still is), sorry about the
mistake.
Most importantly: does GCN handle this, and if yes, how? I think the solution
should be the same for config/gcn and config
On Wed, 21 Apr 2021, Tom de Vries wrote:
> > I don't think implementing futex_wait is possible on nvptx.
> >
>
> Well, I gave it a try, attached below. Can you explain why you think
> it's not possible, or pinpoint a problem in the implementation?
Responding only to this for now. When I said f
On Thu, 22 Apr 2021, Tom de Vries wrote:
> Ah, I see, agreed, that makes sense. I was afraid there was some
> fundamental problem that I overlooked.
>
> Here's an updated version. I've tried to make it clear that the
> futex_wait/wake are locally used versions, not generic functionality.
Could
On Sun, 20 Nov 2022, Jeff Law wrote:
> > The concern, as far as I understand would be the case where the
> > assembly-sequence leaves an incompatible extension in the register.
>
> Right. The question in my mind is whether or not the responsibility should be
> on the compiler or on the develop
On Mon, 21 Nov 2022, Joshi, Tejas Sanjay wrote:
> I have addressed all your comments in the patch attached here. I have also
> used znver4-direct for avx512 insns.
Thanks.
> * This patch increased the insn-automata.cc size from 201502 to 214902.
Assuming it's the number of lines of code, I ha
On Mon, 21 Nov 2022, Jeff Law wrote:
> They're writing assembly code -- in my book that means they'd better have a
> pretty good understanding of the architecture, its limitations and quirks.
That GCC ties together optimization and inline asm interface via its internal
TARGET_MODE_REP_EXTENDED
On Thu, 1 Dec 2022, Joshi, Tejas Sanjay wrote:
> I have addressed all your comments in this revised patch, PFA and inlined
> below.
Thank you. Honza, please let me know if any further input is needed
from my side. For reference, here's how insn-automata.o table sizes
look with this patch (top
Model the divider in Lujiazui processors as a separate automaton to
significantly reduce the overall model size. This should also result
in improved accuracy, as pipe 0 should be able to accept new
instructions while the divider is occupied.
It is unclear why integer divisions are modeled as if pi
Greetings!
While testing our patch that reimplements -Wclobbered on GIMPLE we found
a case where tree-ssa-sink moves a statement to a basic block in front
of a setjmp call.
I am confident that this is unintended and should be considered invalid
GIMPLE. One of the edges incoming to a setjmp BB wil
On Mon, 13 Dec 2021, Richard Biener wrote:
> On December 13, 2021 3:25:47 PM GMT+01:00, Alexander Monakov
> wrote:
> >Greetings!
> >
> >While testing our patch that reimplements -Wclobbered on GIMPLE we found
> >a case where tree-ssa-sink moves a statement to a basic block in front
> >of a setjm
On Mon, 3 Jan 2022, Richard Biener wrote:
> > @@ -5674,6 +5675,14 @@ gimple_verify_flow_info (void)
> >err = 1;
> > }
> >
> > + if (prev_stmt && stmt_starts_bb_p (stmt, prev_stmt))
>
> stmt_starts_bb_p is really a helper used during CFG build, I'd rather
> tes
> I approved the initial sink patch (maybe not clearly enough).
I wasn't entirely happy with that patch. The new version solves this better.
> Can you open
> a bugreport about the missing CFG verification and list the set of FAILs
> (all errors in some passes similar to the one you fixed in sinki
gcc/ChangeLog:
* tree-ssa-sink.c (select_best_block): Punt if selected block
has incoming abnormal edges.
gcc/testsuite/ChangeLog:
* gcc.dg/setjmp-7.c: New test.
---
gcc/testsuite/gcc.dg/setjmp-7.c | 13 +
gcc/tree-ssa-sink.c | 6 ++
2 files
A returns_twice call may have associated abnormal edges that correspond
to the "second return" from the call. If the call is duplicated, the
copies of those edges also need to be abnormal, but e.g. tracer does not
enforce that. Just prohibit the (unlikely to be useful) duplication.
gcc/ChangeLog:
When a returns_twice call has an associated abnormal edge, the edge
corresponds to the "second return" from the call. It wouldn't make sense
if any executable statements appeared between the call and the
destination of the edge (they wouldn't be re-executed upon the "second
return"), so verify that
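
To make the "second return" concrete, here is a small sketch using the standard setjmp/longjmp pair. The function count_setjmp_returns and the variables env and returns_seen are hypothetical names for this illustration only; this is not code from the patch.

```c
#include <setjmp.h>

static jmp_buf env;
static int returns_seen;

/* Hypothetical helper: counts how many times the single setjmp call
   below returns.  */
static int
count_setjmp_returns (void)
{
  returns_seen = 0;
  if (setjmp (env) == 0)
    {
      /* First return: setjmp yields 0.  */
      returns_seen++;
      /* Jump back to the setjmp call, which now returns 1.  */
      longjmp (env, 1);
    }
  else
    /* Second return: reached only through longjmp.  */
    returns_seen++;
  return returns_seen;
}
```

Control re-enters directly at the point where setjmp returns, so nothing placed between the call and that point would run again on the second pass.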
On Mon, 20 Mar 2023, Kewen.Lin wrote:
> Hi,
Hi. Thank you for the thorough analysis. Since I analyzed
PR108519, I'd like to offer my comments.
> As PR108273 shows, when there is one block which only has
> NOTE_P and LABEL_P insns at non-debug mode while has some
> extra DEBUG_INSN_P insns at d
On Tue, 21 Mar 2023, Jeff Law via Gcc-patches wrote:
> On 3/21/23 11:00, Qing Zhao via Gcc-patches wrote:
> >
> >> On Mar 21, 2023, at 12:56 PM, Paul Koning wrote:
> >>
> >>> On Mar 21, 2023, at 11:01 AM, Qing Zhao via Gcc-patches
> >>> wrote:
> >>>
> >>> ...
> >>> Most of the compiler users
On Wed, 22 Mar 2023, Richard Biener wrote:
> I think it's even less realistic to expect users to know the details of
> floating-point math. So I doubt any such sentence will be helpful
> besides spreading some FUD?
I think it's closer to "fundamental notions" rather than "details". For
users w
On Mon, 20 Mar 2023, Jakub Jelinek via Gcc-patches wrote:
> On Mon, Mar 20, 2023 at 10:05:57PM +, Qing Zhao via Gcc-patches wrote:
> > My question: is the above section the place in C standard “explicitly
> > allows contractions”? If not, where it is in C standard?
>
> http://port70.net/%7
Do not attempt to use a plain subtraction for generating a three-way
comparison result in autopref_rank_for_schedule qsort comparator, as
offsets are not restricted and subtraction may overflow. Open-code
a safe three-way comparison instead.
gcc/ChangeLog:
PR rtl-optimization/109187
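
The overflow hazard described above can be sketched as follows, assuming int operands. The function three_way_cmp is a hypothetical stand-in for illustration and is not the code from the patch.

```c
#include <limits.h>

/* Hypothetical helper: overflow-safe three-way comparison.
   Returning a - b instead would overflow (undefined behavior for
   signed int) when, for example, a is INT_MIN and b is INT_MAX.  */
static int
three_way_cmp (int a, int b)
{
  /* Each comparison yields 0 or 1, so the difference is -1, 0 or 1
     and no intermediate value can overflow.  */
  return (a > b) - (a < b);
}
```

A qsort comparator only needs the sign of the result, so this form keeps the ordering consistent for all inputs.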