Re: [AArch64][PR65139] use clobber with match_scratch for aarch64_lshr_sisd_or_int_3

Richard Earnshaw Tue, 21 Apr 2015 06:29:02 -0700

On 18/04/15 19:17, Maxim Kuvyrkov wrote:
>> On Apr 18, 2015, at 8:21 PM, Richard Earnshaw 
>> <[email protected]> wrote:
>>
>> On 18/04/15 16:13, Jakub Jelinek wrote:
>>> On Sat, Apr 18, 2015 at 03:07:16PM +0100, Richard Earnshaw wrote:
>>>> You need to ensure that your scratch register cannot overlap op1, since
>>>> the scratch is written before op1 is read.
>>>
>>> -   (clobber (match_scratch:QI 3 "=X,w,X"))]
>>> +   (clobber (match_scratch:QI 3 "=X,&w,X"))]
>>>
>>> incremental diff should ensure that, right?
>>>
>>>     Jakub
>>>
>>
>>
>> Sorry, where in the patch is that hunk?
>>
>> I see just:
>>
>> +   (clobber (match_scratch:QI 3 "=X,w,X"))]
> 
> Jakub's suggestion is an incremental patch on top of Kugan's.
>


Ah, sorry, I thought he was implying it was already in the patch somewhere.

>>
>> And why would early clobbering the scratch be notably better than the
>> original?
>>
> 
> It will still be better.  With this patch we want to allow RA freedom to 
> optimally handle both of the following cases:
> 
> 1. operand[1] dies after the instruction.  In this case we want operand[0] 
> and operand[1] to be assigned to the same reg, and operand[3] to be assigned 
> to a different register to provide a temporary.  In this case we don't care 
> whether operand[3] is early-clobber or not.  This case is not optimally 
> handled with current insn patterns.
> 
> 2. operand[1] lives on after the instruction.  In this case we want 
> operand[0] and operand[3] to be assigned to the same reg, and not clobber 
> operand[1].  By marking operand[3] early-clobber we ensure that operand[1] is 
> in a different register from what operand[0] and operand[3] were assigned to. 
>  This case should be handled equally well before and after the patch.
> 
> My understanding is that Kugan's patch with Jakub's fix on top satisfies 
> both of these cases.
>  

I still don't think it handles all cases efficiently.  If we really want
the result in a different register from both of the inputs, then we now
need two registers: one for the result and another for the temporary.
In that case we could have used the result register as the scratch, but
now we can't.
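
To make the overlap hazard concrete, here is a small Python model of the
register usage (a sketch only, not GCC code, and untested against the real
pattern).  It assumes the register-shift SISD form expands to two steps: the
negated shift amount is written to the temporary, and only then is op1 read
by the shift.  The register names and the helper sisd_lshr are made up for
illustration.

```python
MASK = (1 << 64) - 1  # model 64-bit registers


def sisd_lshr(regs, dst, src, amt, tmp):
    """Model the assumed two-step sequence on a register file.

    Step 1 writes the temporary; step 2 reads src (op1) afterwards.
    dst, src, amt and tmp are register names, which may alias.
    """
    regs = dict(regs)                 # work on a copy of the register file
    regs[tmp] = -regs[amt]            # step 1: tmp = -shift amount
    # step 2: a left shift by a negative count acts as a right shift
    regs[dst] = (regs[src] & MASK) >> -regs[tmp]
    return regs[dst]


# Hypothetical register file: v1 holds op1, v2 holds the shift amount.
regs = {"v0": 0, "v1": 0x100, "v2": 4, "v3": 0}

separate_tmp = sisd_lshr(regs, dst="v0", src="v1", amt="v2", tmp="v3")
tmp_is_op1 = sisd_lshr(regs, dst="v0", src="v1", amt="v2", tmp="v1")
tmp_is_result = sisd_lshr(regs, dst="v0", src="v1", amt="v2", tmp="v0")
result_is_op1 = sisd_lshr(regs, dst="v1", src="v1", amt="v2", tmp="v3")
```

In this model a temporary that aliases op1 corrupts the input before it is
read, while reusing the result register as the temporary gives the right
answer as long as the result does not alias op1.  Tying the result to op1 is
also fine when a separate temporary is available.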

Maybe we can provide two alternatives, one that early-clobbers the
result register but doesn't need a scratch and one that doesn't
early-clobber the result, but does need a scratch.

So something like

(define_insn "aarch64_lshr_sisd_or_int_<mode>3"
  [(set (match_operand:GPI 0 "register_operand" "=w,&w,w,r")
         (lshiftrt:GPI
           (match_operand:GPI 1 "register_operand" "w,w,w,r")
           (match_operand:QI 2 "aarch64_reg_or_shift_imm_<mode>"
                              "Us<cmode>,w,w,rUs<cmode>")))
   (clobber (match_scratch:QI 3 "=X,X,w,X"))]

... but I haven't tested any of that.

I would also note the conversation in
https://gcc.gnu.org/ml/gcc/2015-04/msg00240.html.  That seems to suggest
we should be wary of using scratch sequences since the register
allocator doesn't account for them properly.

R.

> --
> Maxim Kuvyrkov
> www.linaro.org
>
