I see, thanks Jeff, will make sure the online CI is OK before commit.
Pan
-----Original Message-----
From: Jeff Law
Sent: Saturday, June 21, 2025 10:32 PM
To: Robin Dapp ; Li, Pan2 ;
gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; Chen, Ken ;
Liu, Hongtao
Subject: Re:
> In addition to working with you on the issues of profile being lost with
> LTO, cloning and other cases, my plan is to
> 1) finish the VPT reorganization
> 2) make AFD reader to scale up the profile since at least in data from
> SPEC or profiledbootstrap the counters are quite small integers w
Hi,
This is the last part of the infrastructure to allow functions with
local profiles and 0 global autofdo counts.
Bootstrapped/regtested x86_64-linux, committed.
gcc/ChangeLog:
* auto-profile.cc (afdo_set_bb_count): Dump inline stacks
and reasons when lookup failed.
(afd
Regtested for target=xtensa-linux-uclibc, no new regressions.
Committed to master.
On Mon, Jun 16, 2025 at 11:56 PM Takayuki 'January June' Suwa
wrote:
>
> This patch implements bitfield insertion MD pattern using the DEPBITS
> machine instruction, the counterpart of the EXTUI instruction, if
> av
Hi Tomasz and others,
please don't review this. I found some prerequisites I'm not
checking. While implementing the checks for std::extents, I
got saved by one of the tests that exercises the code with an
IntLike (instead of an int).
Therefore, the first task has to be to tighten up the testing
Hi,
auto-fdo is currently confused by the fact that all inlined functions get
locators with 0 discriminator, so it is not able to distinguish multiple
inlined calls on a single line.
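To make the ambiguity concrete, here is a small hypothetical C example (the
function names are made up for illustration and do not come from the patch).
It only shows the source shape that triggers the problem: two calls to the
same inlinable function on one line. Assuming both inlined copies end up with
discriminator 0, samples from either copy would be attributed to the same
line and discriminator.

```c
#include <assert.h>

/* Hypothetical helper; small enough that the compiler will likely inline it.  */
static inline int square (int x)
{
  return x * x;
}

/* Both calls to square () sit on the same source line.  If every inlined
   copy carries discriminator 0, a sample-based profile has no way to tell
   the first inlined copy from the second.  */
int sum_of_squares (int a, int b)
{
  return square (a) + square (b);
}

/* Sanity check of the arithmetic only; it says nothing about profiles.  */
void self_check (void)
{
  assert (sum_of_squares (2, 3) == 13);
}
```

With distinct discriminators on the two call statements, each inlined copy
could be matched to its own samples.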
Discriminator is lost by calling LOCATION_LOCUS before copying it from
former call statement. I believe this is only
On Sun, Jun 22, 2025 at 1:32 PM Jan Hubicka wrote:
>
> > Since there is
> >
> > /* X86_TUNE_SPLIT_LONG_MOVES: Avoid instructions moving immediates
> >directly to memory. */
> > DEF_TUNE (X86_TUNE_SPLIT_LONG_MOVES, "split_long_moves", m_PPRO)
>
> If I recall correctly, this tune was added for
> Since there is
>
> /* X86_TUNE_SPLIT_LONG_MOVES: Avoid instructions moving immediates
>directly to memory. */
> DEF_TUNE (X86_TUNE_SPLIT_LONG_MOVES, "split_long_moves", m_PPRO)
If I recall correctly, this tune was added for PentiumPro which had
problem decoding moves with long immediate an
>
> Since read-modify-write is enabled for PentiumPro:
>
> /* X86_TUNE_READ_MODIFY_WRITE: Enable use of read modify write instructions
>such as "add $1, mem". */
> DEF_TUNE (X86_TUNE_READ_MODIFY_WRITE, "read_modify_write",
> ~(m_PENT | m_LAKEMONT))
>
> should this
>
> /* Generate
> This contradicts
>
> /* X86_TUNE_READ_MODIFY_WRITE: Enable use of read modify write instructions
>such as "add $1, mem". */
> DEF_TUNE (X86_TUNE_READ_MODIFY_WRITE, "read_modify_write",
> ~(m_PENT | m_LAKEMONT))
>
> which enables "andl $0, (%edx)" for PentiumPro. "andl $0, (%edx
On Sun, Jun 22, 2025 at 2:12 PM Jan Hubicka wrote:
>
> >
> > Since read-modify-write is enabled for PentiumPro:
> >
> > /* X86_TUNE_READ_MODIFY_WRITE: Enable use of read modify write instructions
> >such as "add $1, mem". */
> > DEF_TUNE (X86_TUNE_READ_MODIFY_WRITE, "read_modify_write",
> >
From: Mikael Morin
See the description in the ChangeLog entry below.
The testcases are best effort; for some operators the fortran frontend
generates a temporary variable, so the simplification doesn't happen.
Those cases are not tested.
Regression tested on x86_64-linux. OK for master?
-- 8<
Hi Mikael!
Am 20.06.25 um 12:08 schrieb Mikael Morin:
From: Mikael Morin
Regression-tested on x86_64-pc-linux-gnu.
Ok for master?
-- >8 --
The temporary variables that are generated to implement SELECT TYPE
and TYPE IS statements have (before this change) a name depending only
on the typ
On 2025/06/23 6:20, H.J. Lu wrote:
On Sun, Jun 22, 2025 at 9:54 PM Max Filippov wrote:
On Sun, Jun 22, 2025 at 5:49 AM Takayuki 'January June' Suwa
wrote:
On 2025/06/22 6:41, Max Filippov wrote:
On Sat, Jun 21, 2025 at 2:12 PM Takayuki 'January June' Suwa
wrote:
That hook has since been
acats' fdd2a00.read is miscompiled on arm-linux-gnu with -O2
-fstack-clash-protection -march=armv7-a -marm: a clobbered scratch
register in a *iorsi3_compare0_scratch pattern gets initially assigned
to the frame pointer register, but at some point during lra the frame
size grows to nonzero, arm_f
An x86_64-linux-gnu native with ix86_frame_pointer_required modified
to return true for nonzero frames, to exercise
lra_update_fp2sp_elimination, reveals in stage1 testing that wrong
code is generated for gcc.c-torture/execute/ieee/fp-cmp-8l.c:
argp-to-sp eliminations are used for one_test to pas
> Add a PROCESSOR_XXX comment to each entry in processor_cost_table to
> describe which processor the cost entry is applied to.
>
> * config/i386/i386-options.cc (processor_cost_table): Add a
> PROCESSOR_XXX comment to each entry.
>
>
> --
> H.J.
> From 8b37db60ec21c1c673eb1e336208dc10a5d86d5c
And here's a followup to clean up the mess I made in
lra_update_fp2sp_elimination, without any functional changes.
The various recent additions to lra_update_fp2sp_elimination rendered
it somewhat confusing, with intermixed groups of statements pertaining
to three different major actions: disabli
On Jun 13, 2025, Vladimir Makarov wrote:
>> * lra-eliminations.cc (lra_update_fp2sp_elimination):
>> Inactivate the unused fp2sp elimination right away.
Alas, this seems to cause trouble on arm-linux-gnueabihf bootstraps.
Here's an alternate approach that builds on it to solve the earlier
prob
When attempting to bootstrap arm-linux-gnueabihf with
{BOOT_C,T}FLAGS='-g -O2 -fnon-call-exceptions
-fstack-clash-protection', gmp fails to build in stage2: gen-fac's
mpz_and gets miscompiled.
A pseudo is initialized before a loop and used in a PRE_INC load
inside a loop. It gets spilled just a
On Sun, Jun 22, 2025 at 9:54 PM Max Filippov wrote:
>
> On Sun, Jun 22, 2025 at 5:49 AM Takayuki 'January June' Suwa
> wrote:
> >
> > On 2025/06/22 6:41, Max Filippov wrote:
> > > On Sat, Jun 21, 2025 at 2:12 PM Takayuki 'January June' Suwa
> > > wrote:
> > >>
> > >> That hook has since been dep
Add a PROCESSOR_XXX comment to each entry in processor_cost_table to
describe which processor the cost entry is applied to.
* config/i386/i386-options.cc (processor_cost_table): Add a
PROCESSOR_XXX comment to each entry.
--
H.J.
From 8b37db60ec21c1c673eb1e336208dc10a5d86d5c Mon Sep 17 00:00:00 2
On Mon, Jun 23, 2025 at 11:03 AM H.J. Lu wrote:
>
> Add a PROCESSOR_XXX comment to each entry in processor_cost_table to
> describe which processor the cost entry is applied to.
Ok as obvious.
>
> * config/i386/i386-options.cc (processor_cost_table): Add a
> PROCESSOR_XXX comment to each entry.
>
>
On Sat, Jun 21, 2025 at 11:09 PM H.J. Lu wrote:
>
> On Fri, Jun 20, 2025 at 4:12 PM H.J. Lu wrote:
> >
> > Don't use vmovdqu16/vmovdqu8 with non-EVEX registers even if AVX512BW is
> > available.
> >
> > gcc/
> >
> > PR target/120728
> > * config/i386/i386.cc (ix86_get_ssemov): Use vmovdqu16/vmovd
On Sun, 22 Jun 2025, Jan Hubicka wrote:
> Hi,
> auto-fdo is currently confused by the fact that all inlined functions get
> locators with 0 discriminator, so it is not able to distinguish multiple
> inlined calls on a single line.
>
> Discriminator is lost by calling LOCATION_LOCUS before copying i
On Sat, Jun 21, 2025 at 3:54 PM H.J. Lu wrote:
> On Sun, Jun 22, 2025 at 6:35 AM Max Filippov wrote:
> > On Sat, Jun 21, 2025 at 2:41 PM Max Filippov wrote:
> > > On Sat, Jun 21, 2025 at 2:12 PM Takayuki 'January June' Suwa
> > > wrote:
> > > >
> > > > That hook has since been deprecated
> > >
Hi,
This patch adds GUESSED_GLOBAL0_AFDO profile quality. It can
be used to preserve local counts of functions which have 0 AFDO
profile.
I originally did not include it as it was not clear it would be useful, and
it widens quality from 3 bits to 4 bits, which means that we need to steal another
bit fro
On Mon, Jun 16, 2025 at 6:33 AM Takayuki 'January June' Suwa
wrote:
>
> This patch implements the target-specific ZERO_CALL_USED_REGS hook, since
> with -fzero-call-used-regs=all the default hook tries to assign 0 to B0
> (bit 0 of the BR register) and an ICE is thrown.
>
> gcc/ChangeLog:
>
>
On Fri, Jun 20, 2025 at 10:04 AM Haochen Jiang wrote:
>
> Hi all,
>
> CLDEMOTE is not enabled on clients according to SDM. SDM only mentioned
> it will be enabled on Xeon and Atom servers, not clients. Remove them
> since Alder Lake (where it is introduced).
>
> Also will backport this patch to GC
Hi!
I'd like to ping some C family patches:
https://gcc.gnu.org/pipermail/gcc-patches/2025-April/681741.html
- PR44677 - c, c++: Extend -Wunused-but-set-* warnings
https://gcc.gnu.org/pipermail/gcc-patches/2025-June/685543.html
- PR120520 - Extend nonnull_if_nonzero attribute
plus a questio
Hi,
This patch fixes problems I noticed by exploring profiles of some hot
functions in GCC. In particular the propagation sometimes changed
precise 0 to afdo 0 for paths calling abort and sometimes we could
propagate more when we accept that some paths have 0 count.
Finally there was an important bug
On Sun, Jun 22, 2025 at 5:49 AM Takayuki 'January June' Suwa
wrote:
>
> On 2025/06/22 6:41, Max Filippov wrote:
> > On Sat, Jun 21, 2025 at 2:12 PM Takayuki 'January June' Suwa
> > wrote:
> >>
> >> That hook has since been deprecated
> >> (commit a670ebde3995481225ec62b29686ec07a21e5c10) and has
So this is Andrew's patch from the PR. We weren't clean for a 32bit
host in some of the arithmetic for constant synthesis.
I confirmed the bug on a 32bit linux host, then confirmed that Andrew's
patch from the PR fixes the problem, then ran Andrew's patch through my
tester successfully.
Nat
On Sun, Jun 22, 2025 at 2:57 PM Jan Hubicka wrote:
>
> > This contradicts
> >
> > /* X86_TUNE_READ_MODIFY_WRITE: Enable use of read modify write instructions
> >such as "add $1, mem". */
> > DEF_TUNE (X86_TUNE_READ_MODIFY_WRITE, "read_modify_write",
> > ~(m_PENT | m_LAKEMONT))
> >
>
On 2025/06/22 6:41, Max Filippov wrote:
On Sat, Jun 21, 2025 at 2:12 PM Takayuki 'January June' Suwa
wrote:
That hook has since been deprecated
(commit a670ebde3995481225ec62b29686ec07a21e5c10) and has led to incorrect
results on Xtensa:
/* example */
#define
uint32_t __att
This bug was found by Edwin's fuzzing efforts on RISC-V, though it
likely affects other targets.
In simplest terms when ext-dce converts an extension into a (possibly
simplified) subreg copy it may make an attached REG_EQUAL note invalid.
In the case Edwin found the note was an extension, but