https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103252
--- Comment #14 from Hongtao.liu ---
(In reply to Hongtao.liu from comment #13)
> >
> > So for a register with a short live range, we may lose the opportunity to
> > allocate the best regclass; maybe add a peephole2 to handle those cases
> > instead of tuning RA.
> No, r132
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103252
--- Comment #13 from Hongtao.liu ---
>
> So for a register with a short live range, we may lose the opportunity to
> allocate the best regclass; maybe add a peephole2 to handle those cases
> instead of tuning RA.
No, r132 is also used as addr, but currently lra only add
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103252
--- Comment #12 from Hongtao.liu ---
(In reply to Jason A. Donenfeld from comment #9)
> > When the mask registers are available for use, RA considers them and when
> > spilling to those is cheaper than to memory, it spills to them and not
> >
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103252
--- Comment #11 from Hongtao.liu ---
> Why cprop_hardreg can't handle this?
cprop_hardreg only propagates hard registers, not memory.
(insn 86 85 227 15 (set (reg:SI 68 k0 [132])
(mem/u/c:SI (plus:SI (reg:SI 3 bx [82])
(const:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103252
--- Comment #10 from Hongtao.liu ---
(In reply to Andrew Pinski from comment #4)
> (In reply to Jason A. Donenfeld from comment #2)
> > Here's a more minimal test case: https://gcc.godbolt.org/z/15hnsb6of
>
> kmovd k0, ecx
> m
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103252
--- Comment #9 from Jason A. Donenfeld ---
> When the mask registers are available for use, RA considers them and when
> spilling to those is cheaper than to memory, it spills to them and not memory.
Yes, this is the thing I don't get. When y
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103252
--- Comment #8 from Jakub Jelinek ---
Definitely not lazier. When the mask registers are available for use, RA
considers them and when spilling to those is cheaper than to memory, it spills
to them and not memory. Where cheaper is determined b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103252
--- Comment #7 from Jason A. Donenfeld ---
The strange thing in this case is that the non-avx512 codegen _doesn't_ spill
to memory. It just uses the GPRs that are around. So it seems that,
somehow, the mere existence of the mask registers c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103252
Richard Biener changed:
What|Removed |Added
Ever confirmed|0 |1
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103252
--- Comment #6 from Hongtao.liu ---
> For the original example, though, it doesn't seem to even be saving a spill.
> The non-k0 code is clearly better than the k0 code. I don't know much about
> how the allocator works and interacts with various
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103252
--- Comment #5 from Jason A. Donenfeld ---
> This one is fine/ok as GCC is using k0 as a spill register rather than
> spilling to memory. 32bit x86 has limited registers and all. There is nothing
> odd about this one even.
Right, okay, I see
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103252
--- Comment #4 from Andrew Pinski ---
(In reply to Jason A. Donenfeld from comment #2)
> Here's a more minimal test case: https://gcc.godbolt.org/z/15hnsb6of
kmovd k0, ecx
mov ecx, DWORD PTR __libc_tsd_CTYPE_B@gotntpoff[eb
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103252
--- Comment #3 from Jakub Jelinek ---
On https://gcc.godbolt.org/z/KG63ErzEr I don't see anything wrong, ia32 has
just a few GPRs and all of them are heavily used in the loop, so if the %k?
registers aren't slower than memory, it seems just fine
13 matches