https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92295
--- Comment #2 from liuhongt at gcc dot gnu.org ---
Author: liuhongt
Date: Fri Nov 8 05:34:25 2019
New Revision: 277946
URL: https://gcc.gnu.org/viewcvs?rev=277946&root=gcc&view=rev
Log:
Fix inefficient vector constructor.
Chang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92448
--- Comment #3 from liuhongt at gcc dot gnu.org ---
Author: liuhongt
Date: Mon Nov 18 02:22:55 2019
New Revision: 278385
URL: https://gcc.gnu.org/viewcvs?rev=278385&root=gcc&view=rev
Log:
Split X86_TUNE_AVX128_OPTI
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92686
--- Comment #5 from liuhongt at gcc dot gnu.org ---
Author: liuhongt
Date: Mon Dec 9 04:16:24 2019
New Revision: 279107
URL: https://gcc.gnu.org/viewcvs?rev=279107&root=gcc&view=rev
Log:
Enable mask movement for VCOND_EXPR under avx512f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92865
--- Comment #7 from liuhongt at gcc dot gnu.org ---
Author: liuhongt
Date: Wed Dec 11 08:06:06 2019
New Revision: 279214
URL: https://gcc.gnu.org/viewcvs?rev=279214&root=gcc&view=rev
Log:
Fix unrecognizable insn of pr92865.
gcc/
P
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92807
--- Comment #6 from liuhongt at gcc dot gnu.org ---
Author: liuhongt
Date: Tue Dec 17 01:29:09 2019
New Revision: 279451
URL: https://gcc.gnu.org/viewcvs?rev=279451&root=gcc&view=rev
Log:
Use add for a = a + b and a = b + a when possibl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92651
--- Comment #9 from liuhongt at gcc dot gnu.org ---
Author: liuhongt
Date: Tue Dec 17 01:50:35 2019
New Revision: 279452
URL: https://gcc.gnu.org/viewcvs?rev=279452&root=gcc&view=rev
Log:
Add abs pattern to handle {si,di} mode abs to av
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89750
--- Comment #3 from liuhongt at gcc dot gnu.org ---
Author: liuhongt
Date: Mon Jun 3 02:20:33 2019
New Revision: 271853
URL: https://gcc.gnu.org/viewcvs?rev=271853&root=gcc&view=rev
Log:
2019-05-06 H.J. Lu
Hon
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86444
--- Comment #2 from liuhongt at gcc dot gnu.org ---
Author: liuhongt
Date: Mon Jun 3 02:20:33 2019
New Revision: 271853
URL: https://gcc.gnu.org/viewcvs?rev=271853&root=gcc&view=rev
Log:
2019-05-06 H.J. Lu
Hon
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89803
--- Comment #7 from liuhongt at gcc dot gnu.org ---
Author: liuhongt
Date: Wed Jun 5 06:04:22 2019
New Revision: 271946
URL: https://gcc.gnu.org/viewcvs?rev=271946&root=gcc&view=rev
Log:
gcc/
2019-06-05 Hongtao Liu
PR targ
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #37 from Hongtao Liu ---
(In reply to Richard Biener from comment #36)
> For example with AVX512VL and the following, using -O -fgimple -mavx512vl
> we get simply
>
> notl %esi
> orl %esi, %edi
> cmpb
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #38 from Hongtao Liu ---
> I think we should also mask off the upper bits of variable mask?
>
> notl %esi
> orl %esi, %edi
> notl %edi
> andl $15, %edi
> je .L3
with -mbmi,
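The effect of the `andl $15` in the quoted sequence can be sketched in plain C. This is only an illustration under stated assumptions: a 4-lane mask stored in the low bits of a byte, with `a` and `b` standing in for `%edi` and `%esi`. The function names and the self-check are hypothetical and not taken from the bug report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption for illustration: a 4-lane mask kept in the low bits of a
   byte, so bits 4..7 are padding. */
#define LANE_BITS 0x0Fu

/* Mirrors the quoted sequence: not, or, not, and $15, then test for zero. */
bool all_lanes_set_masked(uint8_t a, uint8_t b)
{
    uint8_t t = (uint8_t)(a | ~b);          /* notl %esi; orl %esi, %edi */
    return ((uint8_t)~t & LANE_BITS) == 0;  /* notl %edi; andl $15, %edi */
}

/* Same test without the AND: padding bits can change the answer. */
bool all_lanes_set_unmasked(uint8_t a, uint8_t b)
{
    uint8_t t = (uint8_t)(a | ~b);
    return (uint8_t)~t == 0;
}

/* Hypothetical self-check, not part of the original discussion. */
void check_mask_padding(void)
{
    /* All four lanes of a | ~b are set, but the padding bits are clear. */
    assert(all_lanes_set_masked(0x0F, 0xFF));
    assert(!all_lanes_set_unmasked(0x0F, 0xFF));
    /* Lane 3 is clear in a and set in b, so both forms report false. */
    assert(!all_lanes_set_masked(0x07, 0x0F));
    assert(!all_lanes_set_unmasked(0x07, 0x0F));
}
```

Calling `check_mask_padding()` from a test driver exercises one input where the two forms disagree and one where they agree.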
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #39 from Hongtao Liu ---
> > the question is whether that matches the semantics of GIMPLE (the padding
> > is inverted, too), whether it invokes undefined behavior (don't do it - it
> > seems for people using intrinsics that's what i
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #43 from Hongtao Liu ---
> Well, yes, the discussion in this bug was whether to do this at consumers
> (that's sth new) or with all mask operations (that's how we handle
> bit-precision integer operations, so it might be relatively
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #44 from Hongtao Liu ---
>
> Note the AND is removed by combine if I add it:
>
> Successfully matched this instruction:
> (set (reg:CCZ 17 flags)
> (compare:CCZ (and:HI (not:HI (subreg:HI (reg:QI 102 [ tem_3 ]) 0))
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #45 from Hongtao Liu ---
> > There's do_store_flag to fixup for uses not in branches and
> > do_compare_and_jump for conditional jumps.
>
> reasonable enough for me.
I mean we only handle it at consumers where the upper bits matter.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #57 from Hongtao Liu ---
> For dg-do run testcases I really think we should avoid those -march=
> options, because it means a lot of other stuff, BMI, LZCNT, ...
Makes sense.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109885
--- Comment #4 from Hongtao Liu ---
int sum() {
int ret = 0;
for (int i=0; i<8; ++i) ret +=(0==v[i]);
return ret;
}
int sum2() {
int ret = 0;
auto m = v==0;
for (int i=0; i<8; ++i) ret += m[i];
return ret;
}
For sum, gcc t
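One detail worth keeping in mind when comparing the two functions: with GNU vector extensions, a vector comparison yields -1 (all bits set) in each true lane, so the two loops do not return the same value. The sketch below is a C rendering under assumptions: the snippet does not show the type of `v`, so eight 32-bit ints are used, an explicit zero vector replaces `v==0`, and the self-check function is hypothetical.

```c
#include <assert.h>

/* Assumption: the snippet does not show v's type; eight 32-bit ints are
   used here for illustration. */
typedef int v8si __attribute__((vector_size(32)));
v8si v;

/* Scalar comparison: each true lane contributes 1. */
int sum(void)
{
    int ret = 0;
    for (int i = 0; i < 8; ++i)
        ret += (0 == v[i]);
    return ret;
}

/* Vector comparison: each true lane of m holds -1, so the total is negated. */
int sum2(void)
{
    const v8si zero = {0};
    int ret = 0;
    v8si m = (v == zero);
    for (int i = 0; i < 8; ++i)
        ret += m[i];
    return ret;
}

/* Hypothetical self-check, not part of the bug report. */
void check_sum_signs(void)
{
    v = (v8si){0, 1, 0, 2, 0, 3, 0, 4};
    assert(sum() == 4);
    assert(sum2() == -4);
}
```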
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114107
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114107
--- Comment #8 from Hongtao Liu ---
(In reply to Hongtao Liu from comment #7)
> perm_cost is very low in the x86 backend, and it may be OK for 128-bit vectors,
> pshufb/shufps are available for most cases.
> But for 256/512-bit vectors, when the permua
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114107
--- Comment #11 from Hongtao Liu ---
(In reply to N Schaeffer from comment #9)
> In addition, optimizing for size with -Os leads to a non-vectorized
> double-loop (51 bytes) while the vectorized loop with vbroadcastsd (produced
> by clang -Os) l
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112325
--- Comment #9 from Hongtao Liu ---
The original case is a little different from the one in PR.
It comes from ggml
#include
#include
typedef uint16_t ggml_fp16_t;
static float table_f32_f16[1 << 16];
inline static float ggml_lookup_fp16_to_
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112325
--- Comment #10 from Hongtao Liu ---
(In reply to Hongtao Liu from comment #9)
> The original case is a little different from the one in PR.
But the issue is similar, after cunrolli, GCC failed to vectorize the outer
loop.
The interesting thing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112325
--- Comment #11 from Hongtao Liu ---
>Loop body is likely going to simplify further, this is difficult
>to guess, we just decrease the result by 1/3. */
>
This is introduced by r0-68074-g91a01f21abfe19
/* Estimate number of insns of
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: liuhongt at gcc dot gnu.org
Target Milestone: ---
Quote from https://gcc.gnu.org/pipermail/gcc-patches/2024-February/646587.html
> On Linux/x86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114125
Hongtao Liu changed:
What|Removed |Added
Ever confirmed|0 |1
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112325
--- Comment #14 from Hongtao Liu ---
(In reply to rguent...@suse.de from comment #13)
> On Tue, 27 Feb 2024, liuhongt at gcc dot gnu.org wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112325
> >
> > --- Comm
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112325
--- Comment #16 from Hongtao Liu ---
> I'm all for removing the 1/3 for innermost loop handling (in cunroll
> the unrolled loop is then innermost). I'm more concerned about
> unrolling more than one level which is exactly what's required for
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114164
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114171
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
Last
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111822
--- Comment #16 from Hongtao Liu ---
(In reply to Uroš Bizjak from comment #11)
> (In reply to Richard Biener from comment #10)
> > The easiest fix would be to refuse applying STV to an insn that
> > can_throw_internal () (that's an insn that has
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110027
--- Comment #12 from Hongtao Liu ---
(In reply to Sam James from comment #11)
> Calling it a 11..14 regression as we know 14 is bad and 7.5 is OK, but I
> can't test 11/12 on an avx512 machine right now.
I can't reproduce that with 11/12, but w
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110027
--- Comment #13 from Hongtao Liu ---
So the stack is like
--- stack top
-32
- (offset -32)
-64 (32 bytes redzone)
- (offset -64)
-128 (64 bytes __m512)
(offset -128)
(32-bytes redzone)
---(offset -1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111731
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110027
--- Comment #14 from Hongtao Liu ---
diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
index 0de299c62e3..92062378d8e 100644
--- a/gcc/cfgexpand.cc
+++ b/gcc/cfgexpand.cc
@@ -1214,7 +1214,7 @@ expand_stack_vars (bool (*pred) (size_t), class
stack
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111822
Hongtao Liu changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110027
--- Comment #15 from Hongtao Liu ---
A patch is posted at
https://gcc.gnu.org/pipermail/gcc-patches/2024-March/647604.html
||2024-03-15
CC||liuhongt at gcc dot gnu.org
Ever confirmed|0 |1
--- Comment #1 from Hongtao Liu ---
Mine
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66862
--- Comment #5 from Hongtao Liu ---
> Now, it seems AVX512BW (and AVX512VL in some cases) has the needed
> instructions,
> in particular VMOVDQU{8,16}, but it is not reflected in maskload and
> maskstore expanders. CCing Kyrill and Uros on this.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114334
Hongtao Liu changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114347
--- Comment #9 from Hongtao Liu ---
(In reply to Richard Biener from comment #7)
> (In reply to Jakub Jelinek from comment #6)
> > You can use -fexcess-precision=16 if you don't want treating _Float16 and
> > __bf16 as having excess precision.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67683
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114396
--- Comment #15 from Hongtao Liu ---
(In reply to Richard Biener from comment #9)
> (In reply to Robin Dapp from comment #8)
> > No fallout on x86 or aarch64.
> >
> > Of course using false instead of TYPE_SIGN (utype) is also possible and
> > m
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114396
Hongtao Liu changed:
What|Removed |Added
Status|NEW |ASSIGNED
--- Comment #16 from Hongtao Liu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114396
--- Comment #17 from Hongtao Liu ---
> >
> > The to_mpz args look like they could be mixing signs as well:
> >
I tried the code below; it looks like mixing signs works well.
Debugging shows step_expr is -5 and signed.
short a = 0xF;
short b[16];
unsigned shor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92080
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92080
--- Comment #9 from Hongtao Liu ---
> If we were to expose that vpxor before postreload we'd likely CSE but
> we have
>
> 5: xmm0:V4SI=const_vector
> REG_EQUIV const_vector
> 6: [`b']=xmm0:V4SI
> 7: xmm0:V8HI=const_vector
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114396
--- Comment #20 from Hongtao Liu ---
(In reply to JuzheZhong from comment #19)
> I think it's better to add pr114396.c to the vect testsuite instead of the x86
> target tests, since the bug does not only happen on x86.
Sure, there's no target specific
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114396
Hongtao Liu changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: liuhongt at gcc dot gnu.org
Target Milestone: ---
void
foo (int* a, short* __restrict b, int* c)
{
for (int i = 0; i != 8; i++)
b[i] = c[i] + a[i
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: liuhongt at gcc dot gnu.org
Target Milestone: ---
typedef unsigned short uint16_t;
typedef short int16_t;
#define QUANT_ONE( coef, mf, f
ority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: liuhongt at gcc dot gnu.org
Target Milestone: ---
typedef unsigned char uint8_t;
uint8_t x264_clip_uint8( int x )
{
return x&(~255) ? (-x)>>31 : x;
}
void
foo (int* a, int* __restrict b, in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114429
Hongtao Liu changed:
What|Removed |Added
Target||x86_64-*-* i?86-*-*
--- Comment #1 from H
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114429
--- Comment #2 from Hongtao Liu ---
(In reply to Hongtao Liu from comment #1)
> When x is INT_MIN, I assume -x is UB, so the compiler can do anything.
> Otherwise, (-x) >> 31 is just x > 0.
> From the RTL view, neg of INT_MIN is assumed to be 0 after it's
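The behaviour of `x264_clip_uint8` discussed here can be checked against a plain clamp. The sketch below assumes a 32-bit `int` and an arithmetic right shift of negative values (implementation-defined in C, and what GCC does). It stays away from INT_MIN, where `-x` overflows. `clip_reference` and `check_clip` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* The function from the report, unchanged apart from spacing. */
uint8_t x264_clip_uint8(int x)
{
    return x & (~255) ? (-x) >> 31 : x;
}

/* Hypothetical reference: an explicit clamp to [0, 255]. */
uint8_t clip_reference(int x)
{
    return x < 0 ? 0 : x > 255 ? 255 : x;
}

/* Hypothetical self-check over a small range that excludes INT_MIN. */
void check_clip(void)
{
    for (int x = -1000; x <= 1000; ++x)
        assert(x264_clip_uint8(x) == clip_reference(x));
}
```

For x > 255, `(-x) >> 31` is -1 and truncates to 255; for negative x other than INT_MIN it is 0, which is the sense in which the shift acts as a test of x > 0.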
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114429
Hongtao Liu changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114471
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114471
--- Comment #6 from Hongtao Liu ---
(In reply to Hongtao Liu from comment #5)
> Maybe we should always use kmask under AVX512; currently only >= 128-bit
> vectors of _Float16 use kmask, while < 128-bit vectors still use a vector mask.
>
and we n
ponent: target
Assignee: unassigned at gcc dot gnu.org
Reporter: liuhongt at gcc dot gnu.org
Target Milestone: ---
v16qi
foo2 (v16qi a, v16qi b)
{
return a >> 7;
}
it can be optimized with
vpxor xmm1, xmm1, xmm1
vpcmpgtb xmm0, xmm1, xmm0
re
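The equivalence behind the suggested `vpxor`/`vpcmpgtb` sequence can be sketched with GNU vector extensions: an arithmetic shift right by 7 of a signed byte gives -1 for negative lanes and 0 otherwise, which is the same mask as comparing zero greater-than the lane. The `v16qi` typedef below is an assumption (the report does not show it), and the function names and self-check are hypothetical.

```c
#include <assert.h>

/* Assumption: v16qi is sixteen signed bytes; the report omits the typedef. */
typedef signed char v16qi __attribute__((vector_size(16)));

/* The form from the report: arithmetic shift replicates the sign bit. */
v16qi sign_by_shift(v16qi a)
{
    return a >> 7;
}

/* The suggested form: 0 > a yields -1 in negative lanes, 0 elsewhere. */
v16qi sign_by_compare(v16qi a)
{
    const v16qi zero = {0};
    return zero > a;
}

/* Hypothetical self-check, not part of the bug report. */
void check_sign_mask(void)
{
    v16qi a = {-128, -1, 0, 1, 127, -64, 64, -2,
               2, -3, 3, -100, 100, -7, 7, 0};
    v16qi s = sign_by_shift(a);
    v16qi c = sign_by_compare(a);
    for (int i = 0; i < 16; ++i) {
        assert(s[i] == c[i]);
        assert(s[i] == (a[i] < 0 ? -1 : 0));
    }
}
```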
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114514
--- Comment #3 from Hongtao Liu ---
(In reply to Andrew Pinski from comment #1)
> Confirmed.
>
> Note non sign bit can be improved too:
> ```
I assume you're talking about broadcast from imm or directly from constant
pool. GCC chooses the forme
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: liuhongt at gcc dot gnu.org
Target Milestone: ---
typedef __uint128_t v128_t __attribute__((vector_size(16)));
v128_t c;
v128_t
foo1 (v128_t *a, v128_t *b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114544
--- Comment #1 from Hongtao Liu ---
;; Turn SImode or DImode extraction from arbitrary SSE/AVX/AVX512F
;; vector modes into vec_extract*.
(define_split
  [(set (match_operand:SWI48x 0 "nonimmediate_operand")
        (sub
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114544
--- Comment #2 from Hongtao Liu ---
Also for
void
foo2 (v128_t* a, v128_t* b)
{
c = (*a & *b)+ *b;
}
(insn 9 8 10 2 (set (reg:V1TI 108 [ _3 ])
(and:V1TI (reg:V1TI 99 [ _2 ])
(mem:V1TI (reg:DI 113) [1 *a_6(D)+0 S16 A128])
ormal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: liuhongt at gcc dot gnu.org
Target Milestone: ---
v32qi
z (void* pa, void* pb, void* pc)
{
v32qi __attribute__((aligned(64))) a;
v32qi __attrib
IRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: liuhongt at gcc dot gnu.org
Target Milestone: ---
typedef float v128_32 __attribute__((vector_size (128 * 4), aligned(2048)));
v128_32
foo (v128_32 a, v128
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113744
Hongtao Liu changed:
What|Removed |Added
Status|UNCONFIRMED |ASSIGNED
Assignee|unassigned at
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116157
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85236
Hongtao Liu changed:
What|Removed |Added
CC||binklings at 163 dot com
--- Comment #8 fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116122
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115981
--- Comment #4 from Hongtao Liu ---
(In reply to Jakub Jelinek from comment #3)
> Created attachment 58786 [details]
> gcc15-pr115981.patch
>
> Untested fix. As since that commit it checks swap_commutative_operands_p:
> 1) CONST_VECTOR I think
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113744
Hongtao Liu changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
||12.1.0
CC||liuhongt at gcc dot gnu.org
Status|NEW |RESOLVED
--- Comment #6 from Hongtao Liu ---
Fixed in GCC12 and above.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116096
Hongtao Liu changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115021
Hongtao Liu changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116274
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116274
--- Comment #5 from Hongtao Liu ---
For non-avx case, looks like it hits here
/* Special case TImode to 128-bit vector conversions via V2DI. */
if (VECTOR_MODE_P (mode)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116274
--- Comment #6 from Hongtao Liu ---
(In reply to Hongtao Liu from comment #5)
> For non-avx case, looks like it hits here
>
> /* Special case TImode to 128-bit vector conversions via V2DI. */
>
Prevent that in reload, we get
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113729
Hongtao Liu changed:
What|Removed |Added
Keywords||missed-optimization
Resolution|--
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116174
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115756
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115749
Bug 115749 depends on bug 115756, which changed state.
Bug 115756 Summary: default tuning for x86_64 produces shifts for `*240`
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115756
What|Removed |Added
--
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115749
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600
--- Comment #10 from Hongtao Liu ---
I think it should be fixed by r15-2820-gab18785840d7b8
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81602
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116274
--- Comment #8 from Hongtao Liu ---
>
> codegen is probably an RA/LRA artifact caused by bad instruction constraints
> and the refusal to reload to a gpr. Not sure if a move high to gpr is a
> thing,
> pextrq would work for sure. But an unpck
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116174
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115982
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
Known to fail|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115683
--- Comment #6 from Hongtao Liu ---
(In reply to Uroš Bizjak from comment #5)
> (In reply to Hongtao Liu from comment #0)
>
> > g++: g++.target/i386/pr100637-1b.C
> > g++: g++.target/i386/pr100637-1w.C
> > g++: g++.target/i386/pr103861-1.C
> >
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116497
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116512
Hongtao Liu changed:
What|Removed |Added
Last reconfirmed|2024-08-28 00:00:00 |
Status|NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116512
--- Comment #4 from Hongtao Liu ---
gdb shows crtl->return_rtx is
(parallel/i:BLK [
(expr_list:REG_DEP_TRUE (reg:XI 20 xmm0)
(c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116512
--- Comment #6 from Hongtao Liu ---
(In reply to Andrew Pinski from comment #5)
> (In reply to Hongtao Liu from comment #4)
> > gdb shows crtl->return_rtx is
> >
> > (parallel/i:BLK [
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116512
Hongtao Liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116582
--- Comment #5 from Hongtao Liu ---
(In reply to Richard Biener from comment #4)
> (In reply to Jan Hubicka from comment #3)
> > Just for completeness the codegen for parest sparse matrix multiply is:
> >
> > 0.31 │320: kmovb %k1,%k
,
||liuhongt at gcc dot gnu.org
--- Comment #2 from Hongtao Liu ---
@Haochen, could you add that?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116617
Hongtao Liu changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: liuhongt at gcc dot gnu.org
Target Milestone: ---
Created attachment 59082
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59082&action=edit
test111.i
g++ -O3 te
Keywords: ice-on-valid-code
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: liuhongt at gcc dot gnu.org
CC: rguenth at gcc dot gnu.org
Target Milestone: ---
Created
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116674
--- Comment #1 from Hongtao Liu ---
Created attachment 59094
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59094&action=edit
test.i
A more reduced case.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116675
Hongtao Liu changed:
What|Removed |Added
CC||liuhongt at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114544
--- Comment #3 from Hongtao Liu ---
<__umodti3>:
...
58: 66 48 0f 6e c7 movq %rdi,%xmm0
5d: 66 48 0f 6e d6 movq %rsi,%xmm2
62: 66 0f 6c c2 punpcklqdq %xmm2,%xmm0
66:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113288
Hongtao Liu changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---