https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115755
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115749
Hongtao Liu changed:
What|Removed |Added
CC||haochen.jiang at intel dot com,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115796
Hongtao Liu changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115115
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113733
Hongtao Liu changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113312
--- Comment #28 from Hongtao Liu ---
__attribute__((no_callee_saved_registers)) was added in GCC 14.
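As a minimal sketch of how the attribute might be applied, assuming an x86 target and a compiler that supports it: the `NO_CALLEE_SAVED` macro and the `bump` function are hypothetical names used only for illustration. The attribute is guarded with `__has_attribute` so the same source still builds with compilers that lack it.

```c
/* Hypothetical helper macro: expands to the attribute only when the
   compiler reports support for it, otherwise to nothing. */
#if defined(__has_attribute)
#  if __has_attribute(no_callee_saved_registers)
#    define NO_CALLEE_SAVED __attribute__((no_callee_saved_registers))
#  endif
#endif
#ifndef NO_CALLEE_SAVED
#  define NO_CALLEE_SAVED
#endif

/* With the attribute, the function is compiled as if it had no
   callee-saved registers, so it may use every register as scratch. */
NO_CALLEE_SAVED int bump(int x)
{
    return x + 1;
}
```

The GCC documentation describes the typical use as a function called from an interrupt-handler assembly stub that already preserves all registers. The trivial function above only shows the syntax.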
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113312
Hongtao Liu changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115833
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115833
Hongtao Liu changed:
What|Removed |Added
CC||lin1.hu at intel dot com
--- Comment #4 f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115842
Hongtao Liu changed:
What|Removed |Added
Last reconfirmed||2024-07-11
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115872
Hongtao Liu changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |liuhongt at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115842
Hongtao Liu changed:
What|Removed |Added
Status|ASSIGNED|UNCONFIRMED
Ever confirmed|1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115842
--- Comment #3 from Hongtao Liu ---
(In reply to Hongtao Liu from comment #2)
> Bisected to r15-1673-gb8153b5417bed0, the commit fixed wrong rtx_cost of
> r15-882-g1d6199e5f8c1c0 which happened to improve 548.exchange_r.
Looks like wrong rtx_c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
Bug 53947 depends on bug 115889, which changed state.
Bug 115889 Summary: [15 Regression] FAIL: gcc.dg/vect/vect-vfa-03.c execution
test with -march=znver4 --param vect-partial-vector-usage=1 since
r15-1368-g6d0b7b69d14302
https://gcc.gnu.org/bu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115889
Hongtao Liu changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115872
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115843
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115843
--- Comment #10 from Hongtao Liu ---
> But using kmovw for QImode mask is not correct as we don't know the value in
> gpr. Perhaps we'd consider restrict the kmovb under avx512dq only.
Why? as long as we only care about lower 8 bits, vmovw sho
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113733
Bug 113733 depends on bug 113711, which changed state.
Bug 113711 Summary: APX instruction set and instructions longer than 15 bytes
(assembly warning)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113711
What|Removed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113711
Hongtao Liu changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115863
Hongtao Liu changed:
What|Removed |Added
CC||lin1.hu at intel dot com
--- Comment #16
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114966
--- Comment #5 from Hongtao Liu ---
I saw pass_esra optimize a BIT_FIELD_REF of big memory into a load from small
memory
Created a replacement for D.161366 offset: 0, size: 64: SR.20D.170101
Created a replacement for D.161366 offset: 64, size: 64:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115978
--- Comment #4 from Hongtao Liu ---
To clarify, the question originally came from whether or not to report error
for -m32,-march=native, and then LLVM folks said it's difficult for LLVM not
issuing error for -march=native -m32, but issuing error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115978
--- Comment #6 from Hongtao Liu ---
(In reply to H.J. Lu from comment #5)
> (In reply to Hongtao Liu from comment #4)
> > To clarify, the question originally came from whether or not to report error
> > for -m32,-march=native, and then LLVM folk
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115994
Bug ID: 115994
Summary: Vectorizer failed to do vectorization for .sat_trunc
when nunits_in / nunits_out > 2
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Seve
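For orientation, here is a minimal sketch of a scalar saturating-truncation loop, assuming unsigned 32-bit inputs narrowed to 8-bit outputs (an element-width ratio of 4). The function names are hypothetical and the code is not taken from the bug report. It only illustrates the semantics of a saturating truncate, not what the vectorizer does with it.

```c
#include <stddef.h>
#include <stdint.h>

/* Saturating truncation: values above UINT8_MAX clamp to UINT8_MAX
   instead of wrapping around. */
static inline uint8_t sat_trunc_u32_u8(uint32_t x)
{
    return x > UINT8_MAX ? (uint8_t)UINT8_MAX : (uint8_t)x;
}

/* Apply the saturating truncation element by element. */
void sat_trunc_array(uint8_t *restrict dst, const uint32_t *restrict src,
                     size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = sat_trunc_u32_u8(src[i]);
}
```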
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115994
--- Comment #1 from Hongtao Liu ---
Also in vect_recog_sat_trunc_pattern
4700 tree v_itype = get_vectype_for_scalar_type (vinfo, itype);
4701 tree v_otype = get_vectype_for_scalar_type (vinfo, otype);
4702 internal_fn fn = IFN_S
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115982
Hongtao Liu changed:
What|Removed |Added
Status|NEW |ASSIGNED
Assignee|unassigned at
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115982
--- Comment #5 from Hongtao Liu ---
Fixed by r15-2217-ga3f03891065cb9, could be latent on release branch since
GCC12
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116043
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116043
Hongtao Liu changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |liuhongt at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116064
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98856
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98856
--- Comment #48 from Hongtao Liu ---
(In reply to Hongtao Liu from comment #47)
> Created attachment 58746 [details]
> Allocate v2di with GPR
>
> The attached patch can allocate V2DI in GPRs to avoid spills.
>
@Uros Is it a good idea to make G
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115978
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|WAITING
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115978
--- Comment #10 from Hongtao Liu ---
(In reply to H.J. Lu from comment #9)
> (In reply to Hongtao Liu from comment #8)
> > Fixed in GCC15,thanks H.J.
>
> Does GCC 14 have the same issue with -m32 -march=native?
Yes, will backport the patch.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96846
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #5
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116096
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116096
Hongtao Liu changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |liuhongt at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116096
--- Comment #3 from Hongtao Liu ---
>
> (define_insn "ashl3_doubleword"
>[(set (match_operand:DWI 0 "register_operand" "=&r,&r")
> - (ashift:DWI (match_operand:DWI 1 "reg_or_pm1_operand" "0n,r")
> + (ashift:DWI (match_operand:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116122
Hongtao Liu changed:
What|Removed |Added
Status|UNCONFIRMED |ASSIGNED
Last reconfirmed|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117072
--- Comment #12 from Hongtao Liu ---
>
> So the backend fix should at least add 8 patterns to handle that, in that
> case, maybe the middle-end canonicalization would be better.
And I will still submit a patch to make the FMA predicates more
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117072
--- Comment #11 from Hongtao Liu ---
(In reply to Hongtao Liu from comment #10)
> (In reply to rguent...@suse.de from comment #9)
> > On Fri, 11 Oct 2024, liuhongt at gcc dot gnu.org wrote:
> >
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117116
Hongtao Liu changed:
What|Removed |Added
Assignee|liuhongt at gcc dot gnu.org|uros at gcc dot gnu.org
--- Commen
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79786
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117081
--- Comment #5 from Hongtao Liu ---
*** Bug 117082 has been marked as a duplicate of this bug. ***
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117082
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |DUPLICATE
Status|NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117116
Hongtao Liu changed:
What|Removed |Added
Last reconfirmed||2024-10-14
Ever confirmed|0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117116
--- Comment #3 from Hongtao Liu ---
A simple testcase
typedef long long v4di __attribute__((vector_size(32)));
v4di
foo (long long a)
{
return __extension__(v4di){(long long)foo, 1, 1, 1};
}
reproduced with -O2 -mavx2, failed at least sin
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117116
--- Comment #4 from Hongtao Liu ---
(In reply to Hongtao Liu from comment #3)
> A simple testcase
>
> typedef long long v4di __attribute__((vector_size(32)));
>
> v4di
> foo (long long a)
> {
> return __extension__(v4di){(long long)foo, 1,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117072
--- Comment #10 from Hongtao Liu ---
(In reply to rguent...@suse.de from comment #9)
> On Fri, 11 Oct 2024, liuhongt at gcc dot gnu.org wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117072
> >
> > --- Comment #8 from Hongtao Liu -
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117088
Bug ID: 117088
Summary: [15 regression] 548.exchange_r regressed by 10% with
-O2 -march=x86-64-v3 after enhance O2 vectorization
Product: gcc
Version: 15.0
Status: UNCON
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117159
--- Comment #2 from Hongtao Liu ---
typedef __attribute__((__vector_size__ (4))) unsigned char W;
typedef __attribute__((__vector_size__ (64))) int V;
typedef __attribute__((__vector_size__ (64))) long long Vq;
W w;
V v;
Vq vq;
static inline W
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117159
Hongtao Liu changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117232
Hongtao Liu changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |liuhongt at gcc dot
gnu.org
Las
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117055
Bug ID: 117055
Summary: [meta-bug] GCC15 O2 vectorization enhancement
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116940
Hongtao Liu changed:
What|Removed |Added
Status|NEW |ASSIGNED
Assignee|unassigned at
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101017
--- Comment #11 from Hongtao Liu ---
(In reply to David Binderman from comment #10)
> Did this ever happen ?
>
> Similar test case gcc/testsuite/gcc.target/i386/avx10_1-26.c
> still seems to cause a crash:
>
> testsuite $ ~/gcc/results/bin/gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117232
--- Comment #4 from Hongtao Liu ---
(In reply to Andrew Pinski from comment #0)
> This is expansion of PR 113609 which showed when I improved phiopt's factor
> operations to handle more than just 1 operand operations.
>
> New reduced testcase t
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117232
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64700
Bug 64700 depends on bug 117232, which changed state.
Bug 117232 Summary: EQ/NE comparison between avx512 kmask and -1 can be
optimized with kxortest with checking CF when using cmov
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117232
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117072
--- Comment #8 from Hongtao Liu ---
(In reply to Richard Biener from comment #7)
> OTOH I'll note that no other simplify_* treats canonicalization as
> simplification and the existing swap_commutative_operands_p transform for FMA
> is highly unc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116064
--- Comment #11 from Hongtao Liu ---
(In reply to Richard Biener from comment #10)
> So - fixed?
Yes.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117116
--- Comment #2 from Hongtao Liu ---
Looks like it just exposes a backend bug; I'll take a look.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117240
Hongtao Liu changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |liuhongt at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117301
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117416
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117407
Hongtao Liu changed:
What|Removed |Added
CC||zsojka at seznam dot cz
--- Comment #5 fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117416
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |DUPLICATE
Status|NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117416
Hongtao Liu changed:
What|Removed |Added
Resolution|DUPLICATE |---
Status|RESOLVED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117438
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117304
--- Comment #4 from Hongtao Liu ---
$: grep AVX512F i386-builtin.def | grep -v EVEX512 | grep -e V8DI -e V8DF -e
V16SI -e V16SF -e V32HI -e V32HF -e V32BF -e V64QI
BDESC (OPTION_MASK_ISA_AVX512F, 0,
CODE_FOR_unspec_fix_truncv8dfv8si2_mask_roun
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117416
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|REOPENED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117438
--- Comment #4 from Hongtao Liu ---
(In reply to Mayshao-oc from comment #0)
> Created attachment 59530 [details]
> gcc -O1 loop.c
>
> Pass_align_tight_loops align the inner loop aggressively, this may cause
> significant performance regression
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117318
Hongtao Liu changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117323
--- Comment #6 from Hongtao Liu ---
(In reply to Andrew Pinski from comment #5)
> Note the reasoning for the difference in arguments between aarch64 and
> x86_64 is that x86_64 defines PUSH_ARGS_REVERSED to be 1.
Interesting define min/max as m
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117323
Bug ID: 117323
Summary: GCC failed to optimize value / 128 to value >> 7 when
the range of value must be positive
Product: gcc
Version: 15.0
Status: UNCONFIRMED
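The equivalence behind that summary can be sketched as follows. The function names are hypothetical and this is not the testcase from the report. For non-negative `value`, division by 128 and a right shift by 7 give the same result. For negative `value` they can differ, which is why range information is needed before the rewrite is valid.

```c
/* Signed division truncates toward zero (C99 and later). */
int div128(int value)
{
    return value / 128;
}

/* Right shift of a negative signed value is implementation-defined in C;
   for non-negative inputs it equals division by 128. */
int shr7(int value)
{
    return value >> 7;
}
```

For example, `div128(-1)` is 0 because division truncates toward zero, whereas an arithmetic right shift of -1 would stay negative.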
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117318
Hongtao Liu changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |liuhongt at gcc dot
gnu.org
E
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117542
--- Comment #2 from Hongtao Liu ---
(In reply to Richard Biener from comment #1)
> It doesn't even unambiguously specify whether the mode is that of the source
> or the destination. The original idea was of course that the size
> unambiguously
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117542
Bug ID: 117542
Summary: Missed loop vectorization for truncate from float to
__bf16.
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Keywords: missed-optimizati
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117240
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117301
--- Comment #3 from Hongtao Liu ---
Yes, the new instructions are still under review for binutils and have not
landed on binutils trunk, but GCC checks the check_effective_target_avx10_2
target with the "old" _mm256_mask_vpdpbssd_epi32.
The problem should be gone when
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117323
--- Comment #4 from Hongtao Liu ---
Another missed optimization is that GCC fails to recognize max_expr for sum1,
which generates a lot of pack/unpack code in the vectorizer
prephitmp_66 = (int) _8;
# DEBUG a => NULL
# DEBUG b => NULL
# DEBUG a
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116765
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116800
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116738
--- Comment #11 from Hongtao Liu ---
Sure.
> Hongtao, can you please take the patch forward?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117159
Hongtao Liu changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |liuhongt at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116940
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115365
Hongtao Liu changed:
What|Removed |Added
Status|REOPENED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117072
Hongtao Liu changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
Bug 53947 depends on bug 117072, which changed state.
Bug 117072 Summary: [15 Regression] FAIL:
gcc.target/i386/cond_op_fma_{float,double,_Float16}-1.c since
r15-3509-gd34cda72098867
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117072
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117438
--- Comment #5 from Hongtao Liu ---
I reproduced a 30% regression on CLX; the aligned case is more frontend-bound.
It's uarch specific, so I will make it a uarch tune.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117304
Hongtao Liu changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117839
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=73350
Hongtao Liu changed:
What|Removed |Added
Status|NEW |RESOLVED
CC|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80862
Bug 80862 depends on bug 73350, which changed state.
Bug 73350 Summary: AVX512: GCC optimizes away rounding flags
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=73350
What|Removed |Added
--
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117562
--- Comment #10 from Hongtao Liu ---
>
> I do wonder about the usefulness of the memory alternative on the
> sse_movhlps pattern though, there's the sse_storehps pattern which
> also models the store part more precisely as V2SFmode. Is
> sse_
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117823
--- Comment #1 from Hongtao Liu ---
The vectorization may need -ffast-math.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117823
Bug ID: 117823
Summary: sdot_prod pattern extended to floating point?
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: mi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116675
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|REOPENED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
Bug 53947 depends on bug 116675, which changed state.
Bug 116675 Summary: No blend constant permute for V8HImode with just SSE2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116675
What|Removed |Added
---