On 04/07/2024 13:50, Siarhei Volkau wrote:
> Thu, 4 Jul 2024 at 12:45, Richard Earnshaw (lists) 
> <richard.earns...@arm.com>:
>>
>> On 20/06/2024 08:24, Siarhei Volkau wrote:
>>> If the address register is dead after load/store operation it looks
>>> beneficial to use LDMIA/STMIA instead of pair of LDR/STR instructions,
>>> at least if optimizing for size.
>>>
>>> Changes v2 -> v3:
>>>  - switching to mixed approach (insn+peep2)
>>>  - keep memory attributes in peepholes
>>>  - handle stmia corner cases
>>>
>>> Changes v1 -> v2:
>>>  - switching to peephole2 approach
>>>  - added test case
>>>
>>> gcc/ChangeLog:
>>>
>>>         * config/arm/arm.cc (thumb_load_double_from_address): Emit ldmia
>>>         when address reg rewritten by load.
>>>
>>>         * config/arm/thumb1.md (peephole2 to rewrite DI/DF load): New.
>>>         (peephole2 to rewrite DI/DF store): New.
>>>
>>>         * config/arm/iterators.md (DIDF): New.
>>>
>>> gcc/testsuite:
>>>
>>>         * gcc.target/arm/thumb1-load-store-64bit.c: Add new test.
>>
>> I made a couple of cleanups and pushed this.  My testing of the cleanup also 
>> identified another corner case for the ldm instruction: if the result of the 
>> load is not used (but it can't be eliminated because the address is marked 
>> volatile), then we could end up with
>>         ldm r0!, {r0, r1}
>> Which of course is unpredictable.  So we need to test not only that r0 is 
>> dead but that it isn't written by the load either.
>>
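As a rough illustration of the condition described above, here is a minimal
sketch in C. It is a standalone model, not the code that was committed: the
function name can_use_ldm_writeback and its parameters are hypothetical, and
registers are represented by their plain numbers rather than by RTL.

```c
#include <stdbool.h>

/* Hypothetical model (not GCC code): decide whether a 64-bit load through
   'base' into the register pair {lo, hi} may be emitted as
       ldm base!, {lo, hi}
   on Thumb-1.  */
static bool
can_use_ldm_writeback (unsigned base, unsigned lo, unsigned hi,
                       bool base_dead_after)
{
  /* Writeback clobbers the address, so it must not be needed afterwards.  */
  if (!base_dead_after)
    return false;

  /* The base must not also be loaded: "ldm r0!, {r0, r1}" is unpredictable.  */
  if (base == lo || base == hi)
    return false;

  /* The 16-bit Thumb encoding only accepts the low registers r0-r7.  */
  if (base > 7 || lo > 7 || hi > 7)
    return false;

  /* LDM fills registers in ascending register order from ascending
     addresses, so the word at the lower address must go to the
     lower-numbered register.  */
  return lo < hi;
}
```

With this model, a load through r2 into {r0, r1} qualifies when r2 is dead,
while a load through r0 into {r0, r1} is rejected even though r0 is dead
afterwards.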
> 
> Good catch.
> 
> Regarding thumb2, I investigated it a bit and found that it is not a small effort:
> 1. DI mode is split into two insns during the subreg1 pass, so it won't be
> as easy to catch as it was for t1.

We need to enable the new pair fusion pass for thumb2 to address this.  But it 
will mostly only result in new ldrd/strd instructions I suspect.

> 2. DF on soft-float t2 targets should have its own rules to transform
> into LDM/STM.
> 3. LDM/STM are slower than LDRD/STRD on dual-issue cores, so they are
> profitable only for -Os.

Speed tends to be micro-architecture (implementation) specific, but yes, it may 
well be the case that this will only be a win at -Os.

R.