https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116675
--- Comment #7 from Hongtao Liu ---
(In reply to Rainer Orth from comment #6)
> The test is broken:
>
> +UNRESOLVED: gcc.target/i386/pr116675.c scan-assembler-times pand 4
> +UNRESOLVED: gcc.target/i386/pr116675.c scan-assembler-times pandn 4
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117734
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |INVALID
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117823
--- Comment #3 from Hongtao Liu ---
> Whether it needs -ffast-math depends on how it behaves with respect to
> rounding I guess. If (float)bf16 * (float)bf16 + (float)bf16 * (float)bf16
> performs the float add without intermediate rounding for
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117562
--- Comment #2 from Hongtao Liu ---
My guess is that there's a low-trip-count (< 128-bit vector) hot loop, so
avx512_two_epilogues only adds more cmp/jcc instructions but doesn't execute
any real vector instructions.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117495
Hongtao Liu changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117418
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117542
--- Comment #5 from Hongtao Liu ---
> Yes, something like this should work. I suggest to polish up a patch
> with this also containing the backend pattern adjustments and post it
> for review. The alternative is a convert optab for vec_pack_t
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117542
--- Comment #3 from Hongtao Liu ---
(In reply to Hongtao Liu from comment #2)
> (In reply to Richard Biener from comment #1)
> > It doesn't even unambiguously specify whether the mode is that of the source
> > or the destination. The original i
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117697
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115438
--- Comment #8 from Hongtao Liu ---
>
> This might in the end be fallout of different sinking?!
>
> One difference wrt SLP vs. non-SLP is that with SLP we are taking the
> initial value as the initial value with SLP while with non-SLP we
> ar
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117608
--- Comment #6 from Hongtao Liu ---
(In reply to Jakub Jelinek from comment #4)
> int i;
>
> void
> foo (void)
> {
> __builtin_prefetch (&i, 2, 0);
> }
>
> ICEs as well since that revision, and I think it actually ICEs on many
> targets as w
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115438
--- Comment #7 from Hongtao Liu ---
I only observed a ~3% regression on ICX; the regressed build executes fewer
instructions but is more backend bound, which lowers IPC and slows down
performance.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117734
--- Comment #1 from Hongtao Liu ---
But there's saturation inside pmaddubsw, so it's not a simple dot_prod pattern.
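To illustrate the saturation point, here is a minimal scalar sketch in C of one 16-bit lane of pmaddubsw next to a plain widening dot-product step. The names sat16, maddubs_lane, dot_lane and demo_checks are hypothetical helpers for illustration only, not GCC internals or intrinsics. The lane semantics assumed here are the documented ones: unsigned byte times signed byte, with adjacent products summed using signed 16-bit saturation.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp a 32-bit value to the signed 16-bit range. */
int16_t sat16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* One 16-bit lane of pmaddubsw (scalar model): a0,a1 are unsigned bytes,
   b0,b1 are signed bytes; the pairwise sum saturates to 16 bits. */
int16_t maddubs_lane(uint8_t a0, uint8_t a1, int8_t b0, int8_t b1)
{
    return sat16((int32_t)a0 * b0 + (int32_t)a1 * b1);
}

/* The same two products accumulated into a wider type, as a plain
   dot product would do; no saturation happens here. */
int32_t dot_lane(uint8_t a0, uint8_t a1, int8_t b0, int8_t b1)
{
    return (int32_t)a0 * b0 + (int32_t)a1 * b1;
}

/* Small inputs agree; extreme inputs diverge because of the clamp. */
void demo_checks(void)
{
    assert(maddubs_lane(1, 2, 3, 4) == dot_lane(1, 2, 3, 4));
    assert(maddubs_lane(255, 255, 127, 127) != dot_lane(255, 255, 127, 127));
}
```

With a0 = a1 = 255 and b0 = b1 = 127 the exact sum is 64770, which does not fit in a signed 16-bit lane, so the two forms give different results for such inputs.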
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117006
Hongtao Liu changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |liuhongt at gcc dot
gnu.org
Las
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117860
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
Bug 53947 depends on bug 113600, which changed state.
Bug 113600 Summary: [14/15 regression] 525.x264_r run-time regresses by 8% with
PGO -Ofast -march=znver4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600
What|Removed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
Bug 26163 depends on bug 113600, which changed state.
Bug 113600 Summary: [14/15 regression] 525.x264_r run-time regresses by 8% with
PGO -Ofast -march=znver4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600
What|Removed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116675
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
Bug 53947 depends on bug 116675, which changed state.
Bug 116675 Summary: No blend constant permute for V8HImode with just SSE2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116675
What|Removed |Added
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117608
--- Comment #2 from Hongtao Liu ---
@hulin please take a look.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117006
--- Comment #7 from Hongtao Liu ---
(In reply to Hongtao Liu from comment #6)
> (In reply to Jakub Jelinek from comment #5)
> > So if anything, one would need to decide this on something larger rather
> > than small testcases, say build the whol
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117888
Hongtao Liu changed:
What|Removed |Added
Status|UNCONFIRMED |ASSIGNED
Ever confirmed|0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117888
--- Comment #1 from Hongtao Liu ---
This is the case that fails to recognize innermost correctly.
typedef unsigned short ggml_fp16_t;
static float table_f32_f16[1 << 16];
inline static float ggml_lookup_fp16_to_fp32(ggml_fp16_t f) {
un
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117888
Bug ID: 117888
Summary: cunrolli doesn't accurately remember what's
"innermost"
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Keywords: missed-optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117888
--- Comment #3 from Hongtao Liu ---
(In reply to Richard Biener from comment #2)
> The question is how we should define innermost - consider
>
> - loop interchange
> - inlining of a function body with a loop into a loop
>
> the simplest appr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117006
Hongtao Liu changed:
What|Removed |Added
Status|ASSIGNED|NEW
Assignee|liuhongt at gcc do
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117890
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117946
Hongtao Liu changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |liuhongt at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117946
--- Comment #6 from Hongtao Liu ---
(In reply to Hongtao Liu from comment #5)
> (In reply to Hongtao Liu from comment #4)
> > The insn is generated by avoid_store_forwarding; it is valid but fails
> > reload
>
> Reload wants to find an insn t
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117946
--- Comment #4 from Hongtao Liu ---
The insn is generated by avoid_store_forwarding; it is valid but fails
reload
Store forwarding detected:
From: (insn 24 23 25 2 (set (mem/c:SI (pl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117946
--- Comment #5 from Hongtao Liu ---
(In reply to Hongtao Liu from comment #4)
> The insn is generated by avoid_store_forwarding; it is valid but fails
> reload
Reload wants to find an insn to move data from GPR to SSE_REGS but
*movti_internal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117946
Hongtao Liu changed:
What|Removed |Added
Assignee|liuhongt at gcc dot gnu.org|unassigned at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117946
--- Comment #7 from Hongtao Liu ---
Choosing alt 6 in insn 295: (0) ?jc (1) Yd {*movti_internal}
(sp_off=-128)
Change to class INDEX_GPR16 for r273
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117946
--- Comment #8 from Hongtao Liu ---
> Why class is changed to INDEX_GPR16 for r273
Note that with -mapxf, the ICE disappears.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118333
--- Comment #2 from Hongtao Liu ---
(In reply to Uroš Bizjak from comment #1)
> (In reply to David Binderman from comment #0)
> > Static analyser cppcheck says:
> >
> > gcc/config/i386/i386-expand.cc:24871:35: warning: Identical condition
> > '
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118189
Bug ID: 118189
Summary: Weired vec_contruct of elements who's from continuous
memory
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Keywords: missed-optimizati
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117082
--- Comment #6 from Hongtao Liu ---
(In reply to H.J. Lu from comment #5)
> It isn't a dup of PR 117081 since it is a different failure.
But it's caused by the same commit and the same root cause?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79786
--- Comment #11 from Hongtao Liu ---
(In reply to Andrew Pinski from comment #8)
> (In reply to Hongtao Liu from comment #7)
> > (In reply to Richard Biener from comment #6)
> > > Hongtao - do we care about -miamcu? Should we eventually deprecat
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117081
--- Comment #8 from Hongtao Liu ---
(In reply to H.J. Lu from comment #7)
> Created attachment 60350 [details]
> ira: Don't increase callee-saved register cost by 1000x
NOTE, r15-1619-g3b9b8d6cfdf593 improved 500.perlbench_r on many different
p
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108707
--- Comment #11 from Hongtao Liu ---
(In reply to Hongtao Liu from comment #10)
> (In reply to Pranav Gorantla from comment #9)
> > Facing similar issue in gcc-13. Is it possible to backport the fix of this
> > Bug 108707 and Bug 109610 to gcc-1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118623
--- Comment #17 from Hongtao Liu ---
(In reply to Jakub Jelinek from comment #15)
> Created attachment 60411 [details]
> gcc15-pr118623.patch
>
> Untested patch which seems to work for me on the new testcases and
> i386.exp=bt*.c so far.
When
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118623
--- Comment #16 from Hongtao Liu ---
(In reply to Jakub Jelinek from comment #14)
> So, if (reg:CCC flags) being non-zero in RTL means nc and (reg:CCC flags)
> being zero in RTL means c, shouldn't *bt be using (compare:CCC
> (zero_extract ...) (
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117081
--- Comment #16 from Hongtao Liu ---
(In reply to H.J. Lu from comment #15)
> r15-7400-gd3ff498c478ace gave
>
> $ cat x.c
> int f (int);
> int
> advance (int dz)
> {
> if (dz > 0)
> return (dz + dz) * dz;
> else
> return dz * f (dz)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117081
--- Comment #9 from Hongtao Liu ---
(In reply to Hongtao Liu from comment #8)
> (In reply to H.J. Lu from comment #7)
> > Created attachment 60350 [details]
> > ira: Don't increase callee-saved register cost by 1000x
>
> NOTE, r15-1619-g3b9b8d6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108707
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117081
--- Comment #14 from Hongtao Liu ---
> can be sunk to the else branch (as sub + mov). When jle .L2 is not taken,
> it can save one push instruction. And that's why 511.povray_r is improved.
plus one pop instruction.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117081
--- Comment #13 from Hongtao Liu ---
(In reply to H.J. Lu from comment #10)
> (In reply to Hongtao Liu from comment #9)
> > (In reply to Hongtao Liu from comment #8)
> > > (In reply to H.J. Lu from comment #7)
> > > > Created attachment 60350 [d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117888
Hongtao Liu changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117874
--- Comment #11 from Hongtao Liu ---
(In reply to Richard Biener from comment #10)
> The mult_su3_an part is now resolved. See PR117888 for the rest.
Fixed by r15-6097-gee2f19b0937b5efc0b23c4319cbd4a38b27eac6e
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118017
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118055
--- Comment #3 from Hongtao Liu ---
>
> Is it perhaps that the test is brittle; mostly target-specific despite being
> at the tree-level and that instead the scan-test should be a specific
> known-matching target list?
The testcase is used to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118055
--- Comment #1 from Hongtao Liu ---
I explained in the thread.
https://gcc.gnu.org/pipermail/gcc-patches/2024-December/671289.html
-
BTW, Arm CI reported 2 regressed testcases, so I added
* gcc.dg/tree-ssa/pr83403-1.c: Add --param max-
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118021
Bug ID: 118021
Summary: [15 regression] ICE in parser
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117562
--- Comment #7 from Hongtao Liu ---
> Huh. It looks like this is from a V4SF -> 2xV2DF extension via
> vec_unpack_{hi,lo}_expr.
>
> Originally this is
>
> (insn 1161 1160 1162 58 (set (reg:V4SF 853)
> (vec_select:V4SF (vec_concat:V8S
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117562
--- Comment #8 from Hongtao Liu ---
> vec_unpacks_hi_v4sf creates an uninitialized (reg:V4SF 853); I guess it may
> confuse LRA into allocating a mem for it.
For a simple case
void
foo (double* a, float* b, int n)
{
for (int i = 0; i != n; i++)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118380
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115777
--- Comment #10 from Hongtao Liu ---
> That's probably the conservative answer for BB vectorization, for loop vect
> we know all those uses will be also in vector code. For BB vectorization
> there is currently no easily reliable check to ensur
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118551
--- Comment #3 from Hongtao Liu ---
(In reply to Andrew Pinski from comment #1)
> I think this is similar to pr 113646 really.
Looks like PR 113646 is PGO not autofdo, so the issue could be different.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118551
--- Comment #2 from Hongtao Liu ---
A hack like the one below can recover performance and further improves
538.imagick_r by 5% w/ autofdo.
The hack prevents the scaling if ipa_count is zero but the function body is hot.
diff --git a/gcc/predict.cc b/gcc/pr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118581
Bug ID: 118581
Summary: auto_profile can't annotate bb with all debug_stmt
which assigned value with constant
Product: gcc
Version: 15.0
Status: UNCONFIRMED
S
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118623
--- Comment #10 from Hongtao Liu ---
> > r12-7751-g919fbffef07555
>
> that might have just exposed a latent issue
It should be; the guilty commit just extended a splitter to handle the reversed
condition, and I didn't see anything abnormal.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118623
--- Comment #12 from Hongtao Liu ---
Trying 35 -> 20:
   35: flags:CCC=cmp(zero_extract(r104:SI,0x1,r105:SI#0),0)
      REG_DEAD r104:SI
      REG_DEAD r105:SI
   20: pc={(flags:CCC!=0)?L26:pc}
      REG_BR_PROB 107374183
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118623
--- Comment #11 from Hongtao Liu ---
(insn 8 7 9 2 (set (reg:SI 107)
        (const_int 1 [0x1])) "test.c":3:7 -1
     (nil))
(insn 9 8 10 2 (parallel [
            (set (reg:SI 106 [ e_7 ])
                (ashift:SI (reg:SI 1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118551
Bug ID: 118551
Summary: Autofdo regressed 538.imagick_r by ~10% with
-march=x86-64-v3 -O2
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118581
--- Comment #4 from Hongtao Liu ---
Note it's from SPEC2017 519.lbm_r
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118581
--- Comment #5 from Hongtao Liu ---
(In reply to Richard Biener from comment #2)
> (In reply to Richard Biener from comment #1)
> > Does it have counter info for PHI arguments (aka copies emitted on those
> > edges)?
>
> I think yes, so IMO it
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118581
--- Comment #3 from Hongtao Liu ---
(In reply to Richard Biener from comment #2)
> (In reply to Richard Biener from comment #1)
> > Does it have counter info for PHI arguments (aka copies emitted on those
> > edges)?
>
> I think yes, so IMO it
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89863
Bug 89863 depends on bug 118333, which changed state.
Bug 118333 Summary: gcc/config/i386/i386-expand.cc:24871: Pointless condition ?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118333
What|Removed |Added
-
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118333
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118489
Hongtao Liu changed:
What|Removed |Added
Last reconfirmed|2025-01-16 00:00:00 |
Target Milestone|15.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118489
Hongtao Liu changed:
What|Removed |Added
Status|UNCONFIRMED |ASSIGNED
Assignee|unassigned at
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118489
Hongtao Liu changed:
What|Removed |Added
Last reconfirmed||2025-01-16
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118489
Hongtao Liu changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118508
Bug ID: 118508
Summary: 10% performance drop when enabling autofdo for
spec2017 554.roms_r
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115777
--- Comment #8 from Hongtao Liu ---
> in backend costing we do anticipate the vector construction to happen
> by loading from memory though, so we don't account for the extra
> GPR->xmm move penalty.
Yes, I saw something similar before and had
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118940
--- Comment #9 from Hongtao Liu ---
>
> > Because I think the operands usage is broken.
>
> Additionally, by removing the do{ ... } while(0) wrap from
> bigint_test_exec(), the issue disappears. I believe that if it is the
> operands usage is
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118940
--- Comment #11 from Hongtao Liu ---
(In reply to Miao Wang from comment #10)
> (In reply to Hongtao Liu from comment #9)
> > >
> > > > Because I think the operands usage is broken.
> > >
> > > Additionally, by removing the do{ ... } while(0)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118802
--- Comment #22 from Hongtao Liu ---
(In reply to Sam James from comment #16)
> Bisected to r15-7400-gd3ff498c478ace (not CCing anyone yet as not enough
> useful information).
There's a new patch in [1] which will revert the commit and may fix
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118994
--- Comment #6 from Hongtao Liu ---
(In reply to John Platts from comment #5)
> GCC also fails to optimize (a | b) - ((a ^ b) >> 1) down to a single SSE2
> PAVGB/PAVGW, NEON/SVE2 SRHADD/URHADD, AltiVec
> vavgsb/vavgsh/vavgsw/vavgub/vavguh/vavguw
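The identity behind the request quoted above can be checked for the unsigned byte case with a short C sketch. It assumes the usual definition of a PAVGB-style rounding average, (a + b + 1) >> 1 computed without overflow. The function names avg_bitwise, avg_rounding and check_all_u8_pairs are hypothetical and chosen only for illustration; the signed and wider-element variants mentioned in the comment are not covered here.

```c
#include <assert.h>
#include <stdint.h>

/* The bitwise form from the comment: (a | b) - ((a ^ b) >> 1). */
uint8_t avg_bitwise(uint8_t a, uint8_t b)
{
    return (uint8_t)((a | b) - ((a ^ b) >> 1));
}

/* Rounding average as an unsigned byte average instruction defines it,
   computed in a wider type so a + b + 1 cannot overflow. */
uint8_t avg_rounding(uint8_t a, uint8_t b)
{
    return (uint8_t)(((unsigned)a + (unsigned)b + 1u) >> 1);
}

/* Exhaustively compare both forms over all 65536 byte pairs. */
void check_all_u8_pairs(void)
{
    for (unsigned a = 0; a < 256; a++)
        for (unsigned b = 0; b < 256; b++)
            assert(avg_bitwise((uint8_t)a, (uint8_t)b) ==
                   avg_rounding((uint8_t)a, (uint8_t)b));
}
```

The two forms agree because a + b = 2 * (a | b) - (a ^ b), so rounding the halved sum up is the same as subtracting the rounded-down half of (a ^ b) from (a | b).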
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118992
--- Comment #9 from Hongtao Liu ---
(In reply to H.J. Lu from comment #8)
> (In reply to Richard Biener from comment #7)
>
> >
> > >else if (targetm.small_register_classes_for_mode_p (GET_MODE (x)))
> > > record = false;
> > >
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118996
--- Comment #1 from Hongtao Liu ---
Looking at the hook description, it looks like x86 still needs nonzero return
values under APX (due to AREG, DREG, CREG, BREG, SIREG, DIREG)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118994
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118996
--- Comment #3 from Hongtao Liu ---
The original commit was added ~24 years ago to avoid a reload failure; maybe we
can try to remove the check in cse.cc.
commit 8bf4dfc24f1957b8f645e362e354655fb851fc89
Author: Geoffrey Keating
Date: Mon Jul 2 23:2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118992
--- Comment #13 from Hongtao Liu ---
(In reply to H.J. Lu from comment #11)
> Created attachment 60590 [details]
> A patch
>
> Can you try this on SPEC CPU?
No big impact for both O2 and Ofast on SPEC2017.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117069
--- Comment #6 from Hongtao Liu ---
The testcase looks fragile: it's supposed to check the compiler's ability to
generate the code_6_gottpoff_reloc instruction, but it fails since
there's a seg_prefixed memory usage (r14-6242-gd564198f960a2f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118753
Bug 118753 depends on bug 117069, which changed state.
Bug 117069 Summary: [15 Regression] gcc.target/i386/apx-ndd-tls-1b.c since
r15-268-g9dbff9c05520a7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117069
What|Removed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118992
--- Comment #12 from Hongtao Liu ---
(In reply to H.J. Lu from comment #11)
> Created attachment 60590 [details]
> A patch
>
> Can you try this on SPEC CPU?
Sure.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117069
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118753
Bug 118753 depends on bug 117069, which changed state.
Bug 117069 Summary: [15 Regression] gcc.target/i386/apx-ndd-tls-1b.c since
r15-268-g9dbff9c05520a7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117069
What|Removed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117069
Hongtao Liu changed:
What|Removed |Added
Status|RESOLVED|REOPENED
Resolution|FIXED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118994
--- Comment #7 from Hongtao Liu ---
diff --git a/gcc/match.pd b/gcc/match.pd
index 5c679848bdf..d6a465c963c 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -11348,3 +11348,28 @@ and,
}
(if (full_perm_p)
(vec_perm (op@3 @0 @
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117081
--- Comment #20 from Hongtao Liu ---
>
> W/o more usage of callee-saved registers, callee needs to restore them
> before exit which is not needed if more caller-saved register are used.
W/ https://gcc.gnu.org/pipermail/gcc-patches/2025-Februa
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117081
--- Comment #19 from Hongtao Liu ---
(In reply to H.J. Lu from comment #18)
> (In reply to Haochen Jiang from comment #17)
> >
> > For reproduce, not only on ADL, the fix patch showed regression on all
> > Cascade Lake/Ice Lake/Sapphire Rapids w
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119083
--- Comment #5 from Hongtao Liu ---
(In reply to H.J. Lu from comment #3)
> Created attachment 60640 [details]
> A patch to remove SSE_FIRST_REG from ix86_class_likely_spilled_p
>
> Hongtao, can you measure its impact on SPEC CPU2017?
Sure.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119083
--- Comment #7 from Hongtao Liu ---
(In reply to Hongtao Liu from comment #5)
> (In reply to H.J. Lu from comment #3)
> > Created attachment 60640 [details]
> > A patch to remove SSE_FIRST_REG from ix86_class_likely_spilled_p
> >
> > Hongtao, c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118996
--- Comment #14 from Hongtao Liu ---
(In reply to H.J. Lu from comment #13)
> (In reply to H.J. Lu from comment #11)
> > Created attachment 60609 [details]
> > An untested patch
>
> Hongtao, do you have SPEC CPU2017 data on this patch?
I haven
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118996
--- Comment #16 from Hongtao Liu ---
(In reply to Hongtao Liu from comment #14)
> (In reply to H.J. Lu from comment #13)
> > (In reply to H.J. Lu from comment #11)
> > > Created attachment 60609 [details]
> > > An untested patch
> >
> > Hongtao
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119083
--- Comment #9 from Hongtao Liu ---
(In reply to H.J. Lu from comment #8)
> Created attachment 60647 [details]
> A patch to remove CREG and BREG from ix86_class_likely_spilled_p
>
> Hongtao, can you measure its impact on SPEC CPU 2017?
Ok.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119142
--- Comment #6 from Hongtao Liu ---
(In reply to Haochen Jiang from comment #5)
> (In reply to Haochen Jiang from comment #4)
> > I suppose that patch should be reverted, caused by Richard S's patch.
> >
> > https://gcc.gnu.org/pipermail/gcc-re