On Wed, 2024-07-31 at 16:57 +0800, Lulu Cheng wrote:
>
> On 2024/7/29 at 3:58 PM, Xi Ruoyao wrote:
> > Per a gcc-help thread we are generating sub-optimal code for
> > __builtin_bswap{32,64}. To fix it:
> >
> > - Use a single revb.d instruction for bswapdi2.
> > - Use a single revb.2w instruction for bswapsi2 for TARGET_64BIT,
> > revb.2h + rotri.w for !TARGET_64BIT.
> > - Use a single revb.2h instruction for bswapsi2 (x) r>> 16, and a single
> > revb.2w instruction for bswapdi2 (x) r>> 32.
> >
> > Unfortunately I cannot figure out a way to make the compiler generate
> > revb.4h or revh.{2w,d} instructions.
>
> This optimization is really ingenious and I have no problem.
>
> I also haven't figured out how to generate revb.4h or revh.{2w,d}.
> I think we can merge this patch first.

Pushed r15-2433.

FWIW I tried a naive pattern for revh.2w:

  (set (match_operand:DI 0 "register_operand" "=r")
       (ior:DI
         (and:DI
           (ashift:DI (match_operand:DI 1 "register_operand" "r")
                      (const_int 16))
           (const_int 18446462603027742720))
         (and:DI
           (lshiftrt:DI (match_dup 1)
                        (const_int 16))
           (const_int 281470681808895))))

But it seems too complex to be recognized.
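
For reference, the operation that pattern describes can be written in C roughly as below. This is only a sketch for illustration: the function name is hypothetical and not part of the patch, and the two masks are the decimal constants from the pattern written in hex (0xffff0000ffff0000 and 0x0000ffff0000ffff). It swaps the two 16-bit halves inside each 32-bit word of a 64-bit value. Whether GCC turns source like this into a single revh.2w is exactly the part I could not get to work.

```c
#include <stdint.h>

/* Hypothetical helper, not from the patch: swap the two 16-bit
   halfwords inside each 32-bit word of X.  This mirrors the
   ior/and/ashift/lshiftrt structure of the RTL pattern.  */
static uint64_t
swap_halfwords_in_words (uint64_t x)
{
  /* Low halfword of each word moves up by 16 bits.  */
  uint64_t up = (x << 16) & 0xffff0000ffff0000ULL;
  /* High halfword of each word moves down by 16 bits.  */
  uint64_t down = (x >> 16) & 0x0000ffff0000ffffULL;
  return up | down;
}
```

For example, an input of 0x0123456789abcdef yields 0x45670123cdef89ab.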
--
Xi Ruoyao <[email protected]>
School of Aerospace Science and Technology, Xidian University