https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108938
Hongtao.liu changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108938
--- Comment #16 from CVS Commits ---
The master branch has been updated by hongtao Liu :
https://gcc.gnu.org/g:d8545fb2c71683f407bfd96706103297d4d6e27b
commit r14-1402-gd8545fb2c71683f407bfd96706103297d4d6e27b
Author: liuhongt
Date: Mon Mar
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108938
Bug 108938 depends on bug 108874, which changed state.
Bug 108874 Summary: [10/11/12/13 Regression] Missing bswap detection
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108874
What|Removed |Added
--
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108938
--- Comment #15 from Hongtao.liu ---
Created attachment 54612
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54612&action=edit
Patch pending for GCC14
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108938
--- Comment #14 from Hongtao.liu ---
Found 1 performance opportunity in GCC itself with bswap + bit_and + rotate; the
intermediate values are all single-use, so they can be DCEd.
Found 4 performance opportunities in SPEC2017.
bswap + bit_and + rotate + s
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108938
--- Comment #13 from Hongtao.liu ---
(In reply to Jakub Jelinek from comment #12)
> (In reply to Hongtao.liu from comment #11)
> > (In reply to Jakub Jelinek from comment #9)
> > > Though, if more than one replacement operation is emitted, one n
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108938
--- Comment #12 from Jakub Jelinek ---
(In reply to Hongtao.liu from comment #11)
> (In reply to Jakub Jelinek from comment #9)
> > Though, if more than one replacement operation is emitted, one needs to be
> > careful not to emit more expensive
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108938
--- Comment #11 from Hongtao.liu ---
(In reply to Jakub Jelinek from comment #9)
> Though, if more than one replacement operation is emitted, one needs to be
> careful not to emit more expensive replacement than the original sequence
> (especial
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108938
--- Comment #10 from Richard Biener ---
One original idea was to leverage VEC_PERM as well, but at least on x86
a vec_perm can expand to many instructions, so costing will be difficult
(and there are obviously cross-register-file movements).
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108938
--- Comment #9 from Jakub Jelinek ---
Though, if more than one replacement operation is emitted, one needs to be
careful not to emit more expensive replacement than the original sequence
(especially if some subexpressions aren't single use).
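The concern in comment #9 can be sketched as a simple profitability check. This is only an illustration under loud assumptions: `struct orig_stmt`, its `single_use` flag, and `replacement_is_cheaper` are hypothetical names, every statement is given unit cost, and this is not the costing used in the actual patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one statement in the matched sequence.
   single_use is true when the statement's result has no users outside
   the sequence, so it becomes dead once the replacement is emitted.  */
struct orig_stmt {
    bool single_use;
};

/* Assumed unit-cost model: the replacement pays off only if it emits
   strictly fewer operations than the number of original statements
   that can actually be removed.  Statements with other users stay
   live and therefore save nothing.  */
static bool replacement_is_cheaper(const struct orig_stmt *stmts,
                                   size_t n_stmts,
                                   size_t n_replacement_ops)
{
    size_t removable = 0;
    for (size_t i = 0; i < n_stmts; i++)
        if (stmts[i].single_use)
            removable++;
    return n_replacement_ops < removable;
}
```

With three original statements that are all single-use, a two-operation replacement (say bswap + rotate) is accepted; if one of the three has another user, the same replacement is rejected.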
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108938
--- Comment #8 from Hongtao.liu ---
(In reply to Jakub Jelinek from comment #6)
> Can be just shift. The bswap descriptions are zero byte means the byte is
> zero,
Yes, I'm also supporting byte permutation as 0x0801020004050607 which is bswap
+
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108938
--- Comment #7 from Hongtao.liu ---
> I'm cooking a patch. For shift cases, it looks like gimple doesn't simplify
> bit mask + rotate to shift, i.e.
>
Created a separate PR109038 for it.
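One instance of the kind of identity meant by "bit mask + rotate to shift" is shown below. It is an illustrative sketch, not necessarily the test case from PR109038; the function names are made up for this example.

```c
#include <stdint.h>

/* Rotate left by c bits; c must be in [0, 31].  */
static uint32_t rotl32(uint32_t x, unsigned c)
{
    return (x << c) | (x >> ((32 - c) & 31));
}

/* Clear the top byte, then rotate left by 8.  The byte that wraps
   around to the bottom is always zero because of the mask.  */
static uint32_t mask_then_rotate(uint32_t x)
{
    return rotl32(x & UINT32_C(0x00ffffff), 8);
}

/* The same value computed with a single logical left shift.  */
static uint32_t plain_shift(uint32_t x)
{
    return x << 8;
}
```

Both functions return the same value for every 32-bit input, so the mask and rotate pair could in principle be folded to the shift.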
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108938
--- Comment #6 from Jakub Jelinek ---
It can be just a shift. In the bswap descriptions, a zero byte means the byte is
zero, 1-8 means a copy of some source byte, and 0xff means unknown. So there is
no need to mask anything; just look at the symbolic n.
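The shift check described in comment #6 can be sketched on a 32-bit symbolic number. The names `CMPNOP32` and `match_shift32` are assumptions made for this example (a 4-byte narrowing of the identity marker), not identifiers from the GCC sources.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed 32-bit identity marker: marker byte k at position k-1 means
   "result byte k-1 is a copy of source byte k" (1 = least significant).
   A 0x00 marker means the result byte is zero; 0xff means unknown.  */
#define CMPNOP32 UINT32_C(0x04030201)

/* Return true if the symbolic number n describes a logical shift of a
   32-bit value by a nonzero multiple of 8 bits.  Shifting the identity
   marker shifts in 0x00 markers, which is exactly "these bytes are
   zero", so no separate mask is needed.  */
static bool match_shift32(uint32_t n, bool *left, unsigned *count)
{
    const uint32_t nop = CMPNOP32;
    for (unsigned c = 8; c < 32; c += 8) {
        if ((uint32_t)(nop << c) == n) {
            *left = true;
            *count = c;
            return true;
        }
        if ((uint32_t)(nop >> c) == n) {
            *left = false;
            *count = c;
            return true;
        }
    }
    return false;
}
```

For example, the symbolic number 0x03020100 matches a left shift by 8, and 0x00040302 matches a logical right shift by 8. A number containing an 0xff (unknown) byte matches nothing.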
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108938
--- Comment #5 from Hongtao.liu ---
(In reply to Jakub Jelinek from comment #4)
> And perhaps next to rotate it could try some left or right (logical) shift
> too.
Yes, we have bit mask in bswap detection, left or right (logical) shift can be
r
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108938
--- Comment #4 from Jakub Jelinek ---
And perhaps next to rotate it could try some left or right (logical) shift too.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108938
--- Comment #3 from Jakub Jelinek ---
You're right. Then we should handle it more generically, basically check if
either the CMPNOP or CMPXCHG patterns (well, their narrowed counterparts based
on size) would match if rotated by a multiple of 8
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108938
--- Comment #2 from Hongtao.liu ---
> patterns it groks, so this would add for 32-bit only 0x03040102 (or, does it
> make sense
It can be even more flexible, e.g. 0x04010203 or 0x02030401; the rotate count
can be any multiple of 8.
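The patterns mentioned here are byte rotations of the 32-bit byte-swap marker 0x01020304. A minimal sketch of a check that accepts any rotate count that is a multiple of 8 follows; `CMPNOP32`, `CMPXCHG32`, and `match_rotated32` are assumed names for 4-byte narrowings of the markers, used only for illustration.

```c
#include <stdint.h>

/* Assumed 32-bit markers: byte value k stands for source byte k
   (1 = least significant).  */
#define CMPNOP32  UINT32_C(0x04030201)  /* bytes unchanged */
#define CMPXCHG32 UINT32_C(0x01020304)  /* full byte swap */

enum rot_kind { ROT_NONE, ROT_NOP, ROT_BSWAP };

/* Rotate left by c bits; c must be in [0, 31].  */
static uint32_t rotl32(uint32_t x, unsigned c)
{
    return (x << c) | (x >> ((32 - c) & 31));
}

/* Check whether the symbolic number n equals the identity or the
   byte-swap marker rotated left by 0, 8, 16, or 24 bits.  On a match,
   *count receives the left-rotate amount.  */
static enum rot_kind match_rotated32(uint32_t n, unsigned *count)
{
    for (unsigned c = 0; c < 32; c += 8) {
        if (rotl32(CMPNOP32, c) == n) {
            *count = c;
            return ROT_NOP;
        }
        if (rotl32(CMPXCHG32, c) == n) {
            *count = c;
            return ROT_BSWAP;
        }
    }
    return ROT_NONE;
}
```

Under these assumptions, 0x02030401 is a byte swap followed by a left rotate of 8, 0x03040102 a byte swap followed by a rotate of 16, and 0x04010203 a byte swap followed by a left rotate of 24 (equivalently a right rotate of 8).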
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108938
Jakub Jelinek changed:
What|Removed |Added
CC||jakub at gcc dot gnu.org
--- Comment #1