On 2020/4/3 06:16, Segher Boessenkool wrote:
> Hi!
>
> On Mon, Mar 30, 2020 at 11:59:57AM +0800, luoxhu wrote:
>>> Do we want something later in the RTL pipeline to make "addi"s etc. again?
>
> (This would be a good thing to consider -- maybe a define_insn_and_split
> will work. But see below).
>
>> [PATCH] rs6000: Don't split constant operator add before reload, move to
>> temp register for future optimization
>>
>> Don't split code from add<mode>3 for SDI to allow a later pass to split.
>> This allows later logic to hoist out constant load in add instructions.
>> In loop, lis+ori could be hoisted out to improve performance compared with
>> previous addis+addi (About 15% on typical case), weak point is
>> one more register is used and one more instruction is generated. i.e.:
>>
>> addis 3,3,0x8765
>> addi 3,3,0x4321
>>
>> =>
>>
>> lis 9,0x8765
>> ori 9,9,0x4321
>> add 3,3,9
>
> (This patch will of course have to wait for stage 1).
>
> Such a define_insn_and_split could be for an add of a (signed) 32-bit
> immediate. combine will try to combine the three insns (lis;ori;add),
> and match the new pattern.
Currently 286r.split2 splits "12:%9:DI=0x87654321" into lis+ori via
rs6000_emit_set_const in a define_split. Do you mean adding a new
define_insn_and_split to do the split? And should that be a separate patch,
sent after this one goes upstream in stage 1?
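
For reference, the equivalence from the commit message (addis+addi versus
lis+ori+add) can be checked with a small Python model of the instructions.
This is only a sketch, not GCC code: the helper names (sext16, addis, addi,
lis, ori, add, old_sequence, new_sequence) are hypothetical, registers are
modelled as 64-bit unsigned integers, and the 16-bit immediates of
addis/addi/lis are sign-extended.

```python
MASK64 = (1 << 64) - 1  # registers are modelled as 64-bit unsigned values


def sext16(imm):
    """Sign-extend a 16-bit immediate field to a Python int."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def addis(ra, imm):
    """addis rt,ra,imm: rt = ra + (sign-extended imm << 16)."""
    return (ra + (sext16(imm) << 16)) & MASK64


def addi(ra, imm):
    """addi rt,ra,imm: rt = ra + sign-extended imm (ra is not r0 here)."""
    return (ra + sext16(imm)) & MASK64


def lis(imm):
    """lis rt,imm: rt = sign-extended imm << 16."""
    return (sext16(imm) << 16) & MASK64


def ori(ra, imm):
    """ori rt,ra,imm: rt = ra | zero-extended imm."""
    return ra | (imm & 0xFFFF)


def add(ra, rb):
    """add rt,ra,rb: rt = ra + rb, wrapping at 64 bits."""
    return (ra + rb) & MASK64


def old_sequence(r3):
    # addis 3,3,0x8765 ; addi 3,3,0x4321
    return addi(addis(r3, 0x8765), 0x4321)


def new_sequence(r3):
    # lis 9,0x8765 ; ori 9,9,0x4321 ; add 3,3,9
    r9 = ori(lis(0x8765), 0x4321)
    return add(r3, r9)


if __name__ == "__main__":
    for r3 in (0, 1, 0x1000, MASK64):
        assert old_sequence(r3) == new_sequence(r3)
    print(hex(ori(lis(0x8765), 0x4321)))
```

In this model r9 holds the sign-extended 32-bit value after lis+ori, so the
two sequences agree for any starting value of r3. Only the lis+ori part is
loop invariant and can be hoisted.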
>
> This also links in with Alan's work on big immediates, and with paddi
> insns, etc.
I guess you mean PR94393? Yes, rs6000_emit_set_const calls
rs6000_emit_set_long_const for DImode.

I tried unsigned long constants such as 0xabcd87654321, 0xffffabcd87654321
and 0xc000000000000000ULL. All of them are already loaded outside the loop
even without my patch, and the generated code is the same with or without
Alan's patch:

0xabcd87654321:
    li 9,0
    ori 9,9,0xabcd
    sldi 9,9,32
    oris 9,9,0x8765
    ori 9,9,0x4321

0xffffabcd87654321:
    lis 9,0xabcd
    ori 9,9,0x8765
    sldi 9,9,16
    ori 9,9,0x4321

0xc000000000000000ULL:
    li 9,-1
    rldicr 9,9,0,1

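
As a sanity check, the three sequences can be replayed in a small Python
model to confirm that each one builds the intended constant. This is a sketch
under stated assumptions, not GCC code: the helper names (sext16, li, lis,
ori, oris, sldi, rldicr) are hypothetical and only model the 64-bit register
result of each instruction.

```python
MASK64 = (1 << 64) - 1  # registers are modelled as 64-bit unsigned values


def sext16(imm):
    """Sign-extend a 16-bit immediate field to a Python int."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def li(imm):
    """li rt,imm: rt = sign-extended imm."""
    return sext16(imm) & MASK64


def lis(imm):
    """lis rt,imm: rt = sign-extended imm << 16."""
    return (sext16(imm) << 16) & MASK64


def ori(ra, imm):
    """ori rt,ra,imm: rt = ra | zero-extended imm."""
    return ra | (imm & 0xFFFF)


def oris(ra, imm):
    """oris rt,ra,imm: rt = ra | (zero-extended imm << 16)."""
    return ra | ((imm & 0xFFFF) << 16)


def sldi(ra, n):
    """sldi rt,ra,n: shift left by n bits, dropping bits above bit 63."""
    return (ra << n) & MASK64


def rldicr(ra, sh, me):
    """rldicr rt,ra,sh,me: rotate left by sh, keep IBM bits 0..me (0 = MSB)."""
    rotated = ((ra << sh) | (ra >> (64 - sh))) & MASK64
    mask = (MASK64 << (63 - me)) & MASK64
    return rotated & mask


# li 9,0 ; ori 9,9,0xabcd ; sldi 9,9,32 ; oris 9,9,0x8765 ; ori 9,9,0x4321
c1 = ori(oris(sldi(ori(li(0), 0xABCD), 32), 0x8765), 0x4321)

# lis 9,0xabcd ; ori 9,9,0x8765 ; sldi 9,9,16 ; ori 9,9,0x4321
c2 = ori(sldi(ori(lis(0xABCD), 0x8765), 16), 0x4321)

# li 9,-1 ; rldicr 9,9,0,1
c3 = rldicr(li(-1), 0, 1)

if __name__ == "__main__":
    for value in (c1, c2, c3):
        print(hex(value))
```

The second sequence depends on lis sign-extending 0xabcd: the upper 0xffff of
the final value comes from the sign bits that survive the 16-bit shift.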
Thanks,
Xionghu
>
>
> Segher
>