https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112904
Hongtao Liu changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112943
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111972
--- Comment #20 from Hongtao Liu ---
(In reply to Andrew Pinski from comment #19)
> Fixed.
Thanks.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112891
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112962
--- Comment #12 from Hongtao Liu ---
(In reply to Jakub Jelinek from comment #8)
> Of course, yet another option is:
> --- gcc/config/i386/i386.cc 2023-12-12 08:54:39.821148670 +0100
> +++ gcc/config/i386/i386.cc 2023-12-12 11:07:03.79528636
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112962
--- Comment #13 from Hongtao Liu ---
> I prefer this solution, that's what we did for blendvps case.
> I don't know either, just follow what we did before (with false) when
> folding builtins.
I mean when I was working on
r14-1145-g1ede03e2d043
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112992
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112992
--- Comment #4 from Hongtao Liu ---
(In reply to Hongtao Liu from comment #3)
> I think we need to also guard SImode and DImode case under AVX2 when
> MODE_SIZE==256.
Since vbroadcastss only supports the m alternative under avx
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112992
--- Comment #5 from Hongtao Liu ---
(In reply to Roger Sayle from comment #0)
> The following four functions should in theory all produce the same code:
>
> typedef unsigned long long v4di __attribute((vector_size(32)));
> typedef unsigned int
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112992
--- Comment #6 from Hongtao Liu ---
> Thoughts? Apologies if this is a dup. I'm happy to work up a patch if
> someone could advise on where best this should be fixed. Perhaps RTL's
> vec_duplicate could be canonicalized to the most appropriat
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112992
--- Comment #7 from Hongtao Liu ---
(In reply to Hongtao Liu from comment #6)
> > Thoughts? Apologies if this is a dup. I'm happy to work up a patch if
> > someone could advise on where best this should be fixed. Perhaps RTL's
> > vec_duplica
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113039
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104401
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113078
Bug ID: 113078
Summary: [14 regression] reduction of cond_sub is not
vectorized.
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Prio
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113079
Bug ID: 113079
Summary: [x86] Fails to generate dot_prod instructions for
64-bit vector.
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113079
--- Comment #1 from Hongtao Liu ---
(In reply to Hongtao Liu from comment #0)
> int
> foo (int n, unsigned char* p, char* pi)
> {
> int sum = 0;
> for (int i = 0; i != 8; i++)
> {
> sum += p[i] * pi[i];
> }
> return s
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113090
Bug ID: 113090
Summary: Suboptimal vector permutation for 64-bit vector.
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110966
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |INVALID
Status|WAITING
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113261
Bug ID: 113261
Summary: missing vectorization for dot_prod chain.
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113261
--- Comment #1 from Hongtao Liu ---
For foo1,
_99 = .REDUC_PLUS (vect_patt_79.51_97);
_90 = .REDUC_PLUS (vect_patt_28.43_88);
_19 = _90 + _99;
can be optimized to
_tmp = vect_patt_79.51_97 + vect_patt_28.43_88;
_19 = .REDUC_PLUS
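The rewrite suggested in this comment rests on the identity that summing two lane reductions equals reducing the lane-wise sum. A minimal scalar sketch follows, assuming 4-lane integer vectors. `reduc_plus`, `two_reductions` and `one_reduction` are hypothetical names used only for illustration, not GCC internals. For floating-point lanes the same rewrite reassociates additions, so it would only hold exactly under relaxed floating-point semantics.

```c
#include <stddef.h>

/* Hypothetical scalar stand-in for .REDUC_PLUS: sums all 4 lanes.  */
static int
reduc_plus (const int v[4])
{
  int sum = 0;
  for (size_t i = 0; i < 4; i++)
    sum += v[i];
  return sum;
}

/* Shape before the rewrite: two reductions, then one scalar add.  */
int
two_reductions (const int a[4], const int b[4])
{
  return reduc_plus (a) + reduc_plus (b);
}

/* Shape after the rewrite: one lane-wise add, then one reduction.  */
int
one_reduction (const int a[4], const int b[4])
{
  int tmp[4];
  for (size_t i = 0; i < 4; i++)
    tmp[i] = a[i] + b[i];
  return reduc_plus (tmp);
}
```

The second form trades a horizontal reduction for a vertical add, and the latter is typically the cheaper operation on SIMD hardware.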
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113288
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104401
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113312
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113345
Bug ID: 113345
Summary: miss optimization for psign{b,w,d}.
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113345
--- Comment #1 from Hongtao Liu ---
>
> maybe we can just refactor the pattern as below, then combine can generate
> the pattern for us.
>
> 22115(define_insn "_psign3"
> 22116 [(set (match_operand:VI124_AVX2 0 "register_operand" "=x,x")
> 22
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113039
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113345
Hongtao Liu changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113126
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113437
--- Comment #2 from Hongtao Liu ---
Maybe we can add target vect_int.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113458
--- Comment #2 from Hongtao Liu ---
> But if we reduce n to 4, the loop based vectorizer is not able to handle it
> either.
Do we support 1-element vectors (i.e. V1SI) in the vectorizer?
It also relies on backend support of dot_prodv4qi.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113458
--- Comment #6 from Hongtao Liu ---
> thus
>
> vect__16.5_40 = MEM [(short int *)a_22(D)];
> vect__17.6_41 = (vector(4) int) vect__16.5_40;
> vect__18.9_44 = MEM [(signed char *)b_23(D)];
> vect_patt_36.10_45 = (vector(4) signed shor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113437
--- Comment #5 from Hongtao Liu ---
(In reply to Andrew Pinski from comment #3)
> (In reply to Hongtao Liu from comment #2)
> > Maybe we can add target vect_int.
>
> Not really because vect_int depends on the vect.exp framework still. See PR
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113437
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113539
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
Bug ID: 113576
Summary: [14 regression] 502.gcc_r hangs
r14-8223-g1c1853a70f9422169190e65e568dcccbce02d95c
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Sever
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #1 from Hongtao Liu ---
int
__attribute__((noinline))
sbitmap_first_set_bit (const_sbitmap bmap)
{
unsigned int n = 0;
sbitmap_iterator sbi;
EXECUTE_IF_SET_IN_SBITMAP (bmap, 0, n, sbi)
return n;
return -1;
}
hangs on th
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #6 from Hongtao Liu ---
Another potential buggy place is
vexit_reduc_67 = mask_patt_43.28_62 & mask_patt_43.28_63;
if (vexit_reduc_67 == { -1, -1, -1, -1 })
goto ; [94.50%]
else
is expanded to
(insn 69
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113592
Bug ID: 113592
Summary: missed partial sum optimization in vectorizer
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113593
Bug ID: 113593
Summary: missed partial sum optimization in vectorizer
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113594
Bug ID: 113594
Summary: Missing partial sum optimization in the vectorizer.
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Compone
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113592
--- Comment #1 from Hongtao Liu ---
*** Bug 113593 has been marked as a duplicate of this bug. ***
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113593
Hongtao Liu changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113594
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |DUPLICATE
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113592
--- Comment #2 from Hongtao Liu ---
*** Bug 113594 has been marked as a duplicate of this bug. ***
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113592
--- Comment #3 from Hongtao Liu ---
This testcase is probably not a good example of a typical partial sum, which
relies on loop unrolling.
double
foo (double* p, int n)
{
double sum = 0;
for (int i = 0; i != n; i++)
sum += p[i] * p[i]
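For readers unfamiliar with the term, here is a minimal sketch of what a partial-sum transformation of such a reduction could look like when written by hand. It assumes an unroll factor of 4, chosen arbitrarily. The function names are hypothetical, and this is not the code GCC generates. Because it reassociates floating-point additions, the result can differ in rounding from the single-accumulator loop, which is why a compiler needs relaxed floating-point flags to do this automatically.

```c
/* Single accumulator: every add depends on the previous one.  */
double
sum_squares (const double *p, int n)
{
  double sum = 0;
  for (int i = 0; i != n; i++)
    sum += p[i] * p[i];
  return sum;
}

/* Partial sums: the loop is unrolled by 4 and each copy feeds its own
   accumulator, so the four dependency chains can run in parallel.  */
double
sum_squares_partial (const double *p, int n)
{
  double s0 = 0, s1 = 0, s2 = 0, s3 = 0;
  int i = 0;
  for (; i + 4 <= n; i += 4)
    {
      s0 += p[i] * p[i];
      s1 += p[i + 1] * p[i + 1];
      s2 += p[i + 2] * p[i + 2];
      s3 += p[i + 3] * p[i + 3];
    }
  /* Combine the partial sums, then handle the remaining elements.  */
  double sum = (s0 + s1) + (s2 + s3);
  for (; i != n; i++)
    sum += p[i] * p[i];
  return sum;
}
```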
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #7 from Hongtao Liu ---
diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index 1fd957288d4..33a8d539b4d 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -8032,7 +8032,7 @@ native_encode_vector_part (const_tree expr, unsign
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #8 from Hongtao Liu ---
maybe
diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index 1fd957288d4..6d321f9baef 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -8035,6 +8035,9 @@ native_encode_vector_part (const_tree exp
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
Hongtao Liu changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
Hongtao Liu changed:
What|Removed |Added
Resolution|FIXED |---
Status|RESOLVED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600
--- Comment #1 from Hongtao Liu ---
Guess it's the same issue as PR112879?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600
--- Comment #2 from Hongtao Liu ---
A patch is posted at
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640276.html
Could you give it a try to see if it fixes the regression? I don't currently have
a znver4 machine for testing.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #20 from Hongtao Liu ---
> Note that I wonder how to eliminate redundant maskings? I suppose
> eventually combine tracking nonzero bits where obvious would do
> that? For example for cmp:V4SI we know the bits will be zero but
> I
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #21 from Hongtao Liu ---
typedef unsigned long mp_limb_t;
typedef long mp_size_t;
typedef unsigned long mp_bitcnt_t;
typedef mp_limb_t *mp_ptr;
typedef const mp_limb_t *mp_srcptr;
#define GMP_LIMB_BITS (sizeof(mp_limb_t) * 8)
#def
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113609
Bug ID: 113609
Summary: EQ/NE comparison between avx512 kmask and -1 can be
optimized with kxortest with checking CF.
Product: gcc
Version: 14.0
Status: UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113609
--- Comment #1 from Hongtao Liu ---
Since they use different modes (CCZ for cmp, but CCS for kortest), it could be
difficult to optimize this in the RA stage by adding alternatives (like we did for
comparison with 0). So the easy way could be adding peephole
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #22 from Hongtao Liu ---
typedef unsigned long mp_limb_t;
typedef long mp_size_t;
typedef unsigned long mp_bitcnt_t;
typedef mp_limb_t *mp_ptr;
typedef const mp_limb_t *mp_srcptr;
#define GMP_LIMB_BITS (sizeof(mp_limb_t) * 8)
#def
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #25 from Hongtao Liu ---
(In reply to Tamar Christina from comment #24)
> Just to avoid confusion, are you still working on this one Richi?
I'm working on a patch to add a target hook as #c18 mentioned.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #28 from Hongtao Liu ---
I saw we already mask off integral modes for vector mask in store_constructor
/* Use sign-extension for uniform boolean vectors with
integer modes and single-bit mask entries.
Ef
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600
--- Comment #5 from Hongtao Liu ---
It looks like x264_pixel_satd_16x16 consumes more time after my commit, an
extracted case is as below, note there's no attribute((always_inline)) in the
original x264_pixel_satd_8x4, it's added to force inline
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600
--- Comment #6 from Hongtao Liu ---
Guess explicit .REDUC_PLUS instead of original VEC_PERM_EXPR somehow impacts
the store split decision.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113656
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113729
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113744
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113729
--- Comment #2 from Hongtao Liu ---
extern unsigned char b;
int
foo (void)
{
return (unsigned char)(200 + b);
}
gcc -O2 -mapxf
foo():
subb $56, b(%rip), %al
movzbl %al, %eax
ret
And this can be optimized to
foo():
subb $56, b(%ri
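The `subb $56` in the generated code is consistent with the `200 + b` in the source because the result is truncated to 8 bits, and 200 is congruent to -56 modulo 256. A small sketch that checks this for every byte value follows. The helper names are hypothetical and only illustrate the arithmetic, not the compiler's reasoning.

```c
/* The source form from the testcase: add 200, truncate to 8 bits.  */
unsigned char
add_200 (unsigned char b)
{
  return (unsigned char) (200 + b);
}

/* The form the assembly uses: subtract 56, truncate to 8 bits.  */
unsigned char
sub_56 (unsigned char b)
{
  return (unsigned char) (b - 56);
}

/* Returns 1 if both forms agree for all 256 possible inputs.  */
int
forms_agree (void)
{
  for (int b = 0; b < 256; b++)
    if (add_200 ((unsigned char) b) != sub_56 ((unsigned char) b))
      return 0;
  return 1;
}
```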
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115384
Hongtao Liu changed:
What|Removed |Added
Status|NEW |ASSIGNED
--- Comment #3 from Hongtao Liu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115406
Hongtao Liu changed:
What|Removed |Added
Status|NEW |ASSIGNED
--- Comment #2 from Hongtao Liu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115406
--- Comment #3 from Hongtao Liu ---
typedef __attribute__((__vector_size__ (1))) char V;
char
foo (V v)
{
return ((V) v == v)[0];
}
int
main ()
{
char x = foo ((V) { });
if (x != -1)
__builtin_abort ();
}
w/ vcond_mask_qiqi, it's no
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115406
--- Comment #4 from Hongtao Liu ---
>
> and for _2 = VIEW_CONVERT_EXPR(_1); we explicitly
> clear the upper bits due to PR113576, and then we get 1 hit the abort.
It's not VIEW_CONVERT_EXPR that clears the upper bits, but _1 = { -1 };
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115406
--- Comment #5 from Hongtao Liu ---
> _2 = VEC_COND_EXPR <_1, { -1 }, { 0 }>;
Hmm, it should check vcond_mask_qiv1qi instead of vcond_mask_qiqi, I guess
since the backend doesn't support v1qi, TYPE_MODE of V is QImode, then it
wrongly checke
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115406
--- Comment #6 from Hongtao Liu ---
For a 1-element vector, when the backend doesn't support its vector mode, the scalar
mode is used for the type, which makes expand_vec_cond_expr_p use QImode for
the icode check (vcond_mask_qiqi).
It could also be the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115365
Hongtao Liu changed:
What|Removed |Added
Target|powerpc64le-linux-gnu, |powerpc64le-linux-gnu,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115418
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115365
--- Comment #7 from Hongtao Liu ---
+/* { dg-final { scan-rtl-dump-times {(?n)^(?!.*REG_EQUIV)(?=.*\(fix:SI)} 3
"final" } } */
Does this fix the testcase on solaris2?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115384
Hongtao Liu changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115452
Bug ID: 115452
Summary: ICE when dump stv2 for gcc.target/i386/pr70322-2.c
with -march=cascadelake
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: nor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115452
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115462
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115463
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115021
--- Comment #5 from Hongtao Liu ---
It's fixed by r15-1100-gec985bc97a0157
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517
Bug ID: 115517
Summary: Fix regression after dropping uses of
vcond{,u,eq}_optab
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Prio
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517
--- Comment #2 from Hongtao Liu ---
(In reply to Richard Biener from comment #1)
> Btw, I had opened PR115490 with my results for this already. Some mitigation
> should be from optimizing ISEL expansion to vcond_mask and I'd start with
> lookin
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517
--- Comment #4 from Hongtao Liu ---
(In reply to rguent...@suse.de from comment #3)
> On Tue, 18 Jun 2024, liuhongt at gcc dot gnu.org wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517
> >
> > --- Comment #2 from Hongtao Liu --
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517
--- Comment #6 from Hongtao Liu ---
(In reply to rguent...@suse.de from comment #5)
> On Tue, 18 Jun 2024, liuhongt at gcc dot gnu.org wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517
> >
> > --- Comment #4 from Hongtao Liu --
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115406
--- Comment #7 from Hongtao Liu ---
>
> BTW, when assign -1 to vector(1) , should the upper bit be
> cleared? Looks like only the 1-element boolean vector is cleared, but not
> vector(2) .
> If the upper bits are not cleared, both 2 cases are equal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115610
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
Last reconf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115450
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
Bug 26163 depends on bug 115462, which changed state.
Bug 115462 Summary: [15 regression] 416.gamess regressed 4-6% on x86_64 since
r15-882-g1d6199e5f8c1c0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115462
What|Removed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115462
Hongtao Liu changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115683
Bug ID: 115683
Summary: SSE2 regressions after obsoleting vcond{,u,eq}.
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115675
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115693
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115610
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517
--- Comment #14 from Hongtao Liu ---
Regressions above SSE4.1 are fixed in GCC 15; SSE2 regressions are tracked in
PR115683.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114189
Bug 114189 depends on bug 115517, which changed state.
Bug 115517 Summary: Fix x86 regressions after dropping uses of
vcond{,u,eq}_optab
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517
What|Removed |Added
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107432
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
CC|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115748
Hongtao Liu changed:
What|Removed |Added
Ever confirmed|0 |1
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109812
--- Comment #23 from Hongtao Liu ---
(In reply to edison from comment #22)
> for 607.cactuBSSN_s,if use preENV_GOMP_CPU_AFFINITY = 0-23 in CPU2017 .cfg,
> all p-core(i9-13900k) usage will down to 15%(the e-core almost 100%), if
> comment out it
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115748
Hongtao Liu changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115756
--- Comment #3 from Hongtao Liu ---
Current rtx_cost for imulq in generic_cost is COSTS_N_INSNS (4); making it
COSTS_N_INSNS (3) could generate imulq.
{COSTS_N_INSNS (3), /* cost of starting multiply for QI */
COSTS_N_INS