csinv opportunities on immediates

pinskia Fri, 17 Jul 2015 12:01:11 -0700


> On Jul 17, 2015, at 9:58 PM, Kyrill Tkachov <kyrylo.tkac...@arm.com> wrote:
> 
> 
>> On 10/07/15 14:45, Kyrill Tkachov wrote:
>>> On 10/07/15 10:00, pins...@gmail.com wrote:
>>> 
>>> 
>>>> On Jul 10, 2015, at 1:47 AM, Kyrill Tkachov <kyrylo.tkac...@arm.com> wrote:
>>>> 
>>>> Hi Andrew,
>>>> 
>>>>> On 10/07/15 09:40, pins...@gmail.com wrote:
>>>>> 
>>>>> 
>>>>> 
>>>>>> On Jul 10, 2015, at 1:34 AM, Kyrill Tkachov <kyrylo.tkac...@arm.com> 
>>>>>> wrote:
>>>>>> 
>>>>>> Hi all,
>>>>>> 
>>>>>> Currently when evaluating expressions like (a ? 24 : 25) we will move 24 
>>>>>> and 25 into
>>>>>> registers and perform a csel on them.  This misses the opportunity to 
>>>>>> instead move just 24
>>>>>> into a register and then perform a csinc, saving us an instruction and a 
>>>>>> register use.
>>>>>> Similarly for csneg and csinv.
>>>>>> 
>>>>>> This patch implements that idea by allowing such pairs of immediates in 
>>>>>> *cmov<mode>_insn
>>>>>> and adding an early splitter that performs the necessary transformation.
>>>>>> 
>>>>>> The testcase included in the patch demonstrates the kind of 
>>>>>> opportunities that are now picked up.
>>>>>> 
>>>>>> With this patch I see about 9.6% more csinc instructions being generated 
>>>>>> for SPEC2006
>>>>>> and the generated code looks objectively better (i.e. fewer 
>>>>>> mov-immediates and slightly
>>>>>> lower register pressure).
>>>>>> 
>>>>>> Bootstrapped and tested on aarch64.
>>>>>> 
>>>>>> Ok for trunk?
>>>>> I think this is the wrong place for this optimization. It should happen 
>>>>> in expr.c and we should produce cond_expr on the gimple level.
>>>> I had considered it, but I wasn't sure how general the conditional 
>>>> increment/negate/inverse operations
>>>> are to warrant a midend implementation. Do you mean the 
>>>> expand_cond_expr_using_cmove function in expr.c?
>>> Yes and we can expand it to even have a target hook on how to expand them 
>>> if needed.
>> I played around in that part and it seems that by the time it gets to 
>> expansion the midend
>> doesn't have a cond_expr of the two immediates, it's a PHI node with the 
>> immediates already expanded.
>> I have not been able to get it to match a cond_expr of two immediates there, 
>> although that could be
>> because I'm unfamiliar with that part of the codebase.
> 
> So by the time we reach expansion time we don't have a COND_EXPR of two 
> immediates, so I tried getting
> the code in expr.c to do the right thing, but it didn't work out.
> This patch catches this opportunity at the RTL level and could catch such 
> cases if they were to be
> generated by any of the pre-combine RTL passes. Or do you reckon looking for 
> these patterns in RTL
> ifcvt is the way to go? I think it would be somewhat messy to express the 
> CSNEG, CSINV opportunities
> there as we don't have optabs for conditional negate and invert, but 
> conditional increment would work,
> though in the aarch64 case we can only do a conditional by 1 rather than a 
> general conditional add.

Right as I said, I have a patch to phiopt to produce the cond_expr when it is 
useful. That is create cond_expr before we even get to rtl. 

Thanks,
Andrew


> 
> Kyrill
> 
> 
>> 
>> Kyrill
>> 
>>> There is already a standard pattern for condition add so the a ? Const1 : 
>>> const2 can be handled in the a generic way without much troubles. We should 
>>> handle it better in rtl  ifcvt too (that should be an easier patch). The 
>>> neg and not cases are very target specific but can be handled by a target 
>>> hook and expand it directly to it.
>>> 
>>> 
>>>>>   I have patches to do both but I have not got around to cleaning them 
>>>>> up. If anyone wants them, I can send a link to my current gcc 5.1 sources 
>>>>> with them included.
>>>> Any chance you can post them on gcc-patches even as a rough idea of what 
>>>> needs to be done?
>>> I posted my expr patch a few years ago but I never got around to rth's 
>>> comments. This was the generic increment patch. Basically aarch64 should be 
>>> implementing that pattern too.
>>> 
>>> 
>>> The main reason why this should be handled in gimple is that ifcvt on the 
>>> rtl level is not cheap and does not catch all of the cases the simple 
>>> expansion of phi-opt does. I can dig that patch up and I will be doing that 
>>> next week anyways.
>>> 
>>> Thanks,
>>> Andrew
>>> 
>>>> Thanks,
>>>> Kyrill
>>>> 
>>>>>   Thanks,
>>>>> Andrew
>>>>> 
>>>>>> Thanks,
>>>>>> Kyrill
>>>>>> 
>>>>>> 2015-07-10  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>
>>>>>> 
>>>>>>     * config/aarch64/aarch64.md (*cmov<mode>_insn): Move stricter
>>>>>>     check for operands 3 and 4 to pattern predicate.  Allow immediates
>>>>>>     that can be expressed as csinc/csneg/csinv.  New define_split.
>>>>>>     (*csinv3<mode>_insn): Rename to...
>>>>>>     (csinv3<mode>_insn): ... This.
>>>>>>     * config/aarch64/aarch64.h (AARCH64_IMMS_OK_FOR_CSNEG): New macro.
>>>>>>     (AARCH64_IMMS_OK_FOR_CSINC): Likewise.
>>>>>>     (AARCH64_IMMS_OK_FOR_CSINV): Likewise.
>>>>>>     * config/aarch64/aarch64.c (aarch64_imms_ok_for_cond_op_1):
>>>>>>     New function.
>>>>>>     (aarch64_imms_ok_for_cond_op): Likewise.
>>>>>>     * config/aarch64/aarch64-protos.h (aarch64_imms_ok_for_cond_op_1):
>>>>>>     Declare prototype.
>>>>>>     (aarch64_imms_ok_for_cond_op): Likewise.
>>>>>> 
>>>>>> 2015-07-10  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>
>>>>>> 
>>>>>>     * gcc.target/aarch64/cond-op-imm_1.c: New test.
>>>>>> <aarch64-csinc-imms.patch>
>
Re: [PATCH][AArch64] Improve csinc/csneg/csinv opportunities on immediates

Reply via email to