On 05/16/2014 12:05 PM, Kugan wrote:
> 
> 
> On 16/05/14 20:40, pins...@gmail.com wrote:
>>
>>
>>> On May 16, 2014, at 3:23 AM, Kugan <kugan.vivekanandara...@linaro.org> 
>>> wrote:
>>>
>>> I would like to know if there is any way we can use registers from a
>>> particular register class purely as spill registers (in places where
>>> the register allocator would normally spill to the stack, and nothing
>>> more), when that can be useful.
>>>
>>> In AArch64, in some cases, compiling with -mgeneral-regs-only produces
>>> better performance compared to not using it. The difference here is that
>>> when -mgeneral-regs-only is not used, floating-point registers are also
>>> used in register allocation. Then IRA/LRA has to move them to core
>>> registers before performing operations as shown below.
>>
>> Can you show the code with fp registers disabled?  Does it use the stack to
>> spill?  Normally this is due to the register-to-register class move cost
>> compared to the register-to-memory move cost.  Also I think it depends on
>> the processor rather than the target.  For thunder, using the fp registers
>> might actually be better than using the stack, depending on whether the
>> stack was in L1.
> Not all of the LDR/STR combinations map to fmov. In the testcase I have,
> 
> aarch64-none-linux-gnu-gcc sha_dgst.c -O2  -S  -mgeneral-regs-only
> grep -c "ldr" sha_dgst.s
> 50
> grep -c "str" sha_dgst.s
> 42
> grep -c "fmov" sha_dgst.s
> 0
> 
> aarch64-none-linux-gnu-gcc sha_dgst.c -O2  -S
> grep -c "ldr" sha_dgst.s
> 42
> grep -c "str" sha_dgst.s
> 31
> grep -c "fmov" sha_dgst.s
> 105
> 
> I am not saying that we shouldn't use floating-point registers here. But
> from the above, it seems like the register allocator is using them more
> like core registers (even though the cost model gives them a higher cost)
> and then moving the values to core registers before operations. If that is
> the case, my question is: how do we make this just a spill register class,
> so that we replace ldr/str with an equal number of fmovs when that is
> possible?
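
One caveat on reading the counts above: grep -c "str" counts lines that
contain the substring, so it also picks up strb/strh and directives such as
.string, and neither grep sees ldp/stp. A minimal sketch of a per-mnemonic
count follows, assuming GNU-as-style output where the mnemonic is the first
token after an optional label. The helper name count_mnemonics and the
sample text are illustrative only and are not taken from sha_dgst.s.

```python
import re
from collections import Counter

# Hypothetical helper (not part of any GCC tooling): count exact
# instruction mnemonics in AArch64 assembly text.
LABEL = re.compile(r"^\s*[\w.$]+:\s*")

def count_mnemonics(asm_text):
    counts = Counter()
    for line in asm_text.splitlines():
        line = line.split("//", 1)[0]         # drop AArch64 line comments
        line = LABEL.sub("", line).strip()    # drop a leading label, if any
        if not line or line.startswith("."):  # skip blanks and directives
            continue
        counts[line.split()[0].lower()] += 1  # first token is the mnemonic
    return counts

# Made-up sample, only to show the difference from a substring match.
sample = """
    .string "str in a directive"
.L2:
    ldr x0, [sp, 8]
    fmov w2, s8
    str w2, [x21, 88]
    strb w3, [x0]
"""
counts = count_mnemonics(sample)
print(counts["ldr"], counts["str"], counts["fmov"])
```

On this sample a substring grep for "str" would report three lines, while
the exact count for the str mnemonic is one.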

I'm also seeing stuff like this:

=> 0x7fb72a0928 <ClassFileParser::parse_constant_pool_entries(int, Thread*)+2500>:
    add     x21, x4, x21, lsl #3
=> 0x7fb72a092c <ClassFileParser::parse_constant_pool_entries(int, Thread*)+2504>:
    fmov    w2, s8
=> 0x7fb72a0930 <ClassFileParser::parse_constant_pool_entries(int, Thread*)+2508>:
    str     w2, [x21,#88]

I guess GCC doesn't know how to store an SImode value held in an FP register
directly to memory?  This is GCC 4.8.1.

Andrew.
