On 9/5/19 11:21 AM, Richard Earnshaw (lists) wrote:
> On 04/09/2019 16:48, Richard Earnshaw (lists) wrote:
>> On 04/09/2019 16:00, Bernd Edlinger wrote:
>>> On 9/4/19 4:14 PM, Richard Earnshaw (lists) wrote:
>>>> On 04/09/2019 14:28, Bernd Edlinger wrote:
>>>>> On 9/4/19 2:53 PM, Richard Earnshaw (lists) wrote:
>>>>>> Index: gcc/testsuite/gcc.target/arm/unaligned-argument-2.c
>>>>>> ===================================================================
>>>>>> --- gcc/testsuite/gcc.target/arm/unaligned-argument-2.c    (Revision 0)
>>>>>> +++ gcc/testsuite/gcc.target/arm/unaligned-argument-2.c    (Arbeitskopie)
>>>>>> @@ -0,0 +1,19 @@
>>>>>> +/* { dg-do compile } */
>>>>>> +/* { dg-require-effective-target arm_arm_ok } */
>>>>>> +/* { dg-require-effective-target arm_ldrd_strd_ok } */
>>>>>> +/* { dg-options "-marm -mno-unaligned-access -O3" } */
>>>>>> +
>>>>>> +struct s {
>>>>>> +  int a, b;
>>>>>> +} __attribute__((aligned(8)));
>>>>>> +
>>>>>> +struct s f0;
>>>>>> +
>>>>>> +void f(int a, int b, int c, int d, int e, struct s f)
>>>>>> +{
>>>>>> +  f0 = f;
>>>>>> +}
>>>>>> +
>>>>>> +/* { dg-final { scan-assembler-times "ldrd" 0 } } */
>>>>>> +/* { dg-final { scan-assembler-times "strd" 0 } } */
>>>>>> +/* { dg-final { scan-assembler-times "stm" 1 } } */
>>>>>>
>>>>>> I don't think this test is right.  While we can't use an LDRD to load 
>>>>>> the argument off the stack, there's nothing wrong with using an STRD to 
>>>>>> then store the value to f0 (as that is 8-byte aligned).  So the second 
>>>>>> and third scan-assembler tests are meaningless.
>>>>>>
>>>>>
>>>>> Ah, that is very similar to the unaligned-memcpy-2/3.c,
>>>>> see https://gcc.gnu.org/ml/gcc-patches/2019-09/msg00157.html
>>>>>
>>>>> initially that is a movdi,
>>>>> then in subreg1 it is split in two movsi
>>>>> which is then re-assembled as ldm
>>>>>
>>>>>
>>>>> Not sure if that is intended in that way.
>>>>>
>>>>>
>>>>
>>>> Yeah, these are causing me some problems too, but that's because with some 
>>>> changes I'm working on I now see the compiler using r4 and r5, which leads 
>>>> to prologue and epilogue stores that distort the results.
>>>>
>>>> Tests like this are generally fragile - I hate 'em!!!!
>>>>
>>>
>>> Yeah, that changed since r275063 introduced the unaligned-load/storedi
>>>
>>> r275063 | edlinger | 2019-08-30 12:38:37 +0200 (Fr, 30. Aug 2019) | 10 
>>> Zeilen
>>> Geänderte Pfade:
>>>     M /trunk/gcc/ChangeLog
>>>     M /trunk/gcc/config/arm/arm.c
>>>     M /trunk/gcc/config/arm/arm.md
>>>     M /trunk/gcc/config/arm/neon.md
>>>
>>> 2019-08-30  Bernd Edlinger  <bernd.edlin...@hotmail.de>
>>>
>>>          * config/arm/arm.md (unaligned_loaddi,
>>>          unaligned_storedi): New unspec insn patterns.
>>>          * config/arm/neon.md (unaligned_storev8qi): Likewise.
>>>          * config/arm/arm.c (gen_cpymem_ldrd_strd): Use unaligned_loaddi
>>>          and unaligned_storedi for 4-byte aligned memory.
>>>          (arm_block_set_aligned_vect): Use unaligned_storev8qi for
>>>          4-byte aligned memory.
>>>
>>> Since other than the movdi they are not split up but stay as ldrd/strd.
>>> But for some unknown reason ira assigns r4-5 to those although also
>>> r1-2 would be available. :-(
>>>
>>
>> r1-r2 can't be used in Arm state as the register has to start on an even 
>> boundary.  But ira has already used r3 for the address of the store (it 
>> could have picked r1) and now r4-r5 is the next even-numbered pair.  So we 
>> end up with needing to save some call-clobbered regs.
>>
>> R.
>>>
>>> Bernd.
>>>
>>
> 
> One possible trick to stabilize the test is to insert an asm that clobbers r4 
> and r5 and forces the prologue/epilogue code to always save and restore them. 
>  Then we can account for those prologue/epilogue consistently (at least, 
> modulo the arm_prefer_ldrd_strd condition).
> 

Yes, or add -fno-omit-frame-pointer.

BTW: have you seen this negative lookahead in my patch
[PATCH] [ARM] Adjust test expectations of unaligned-memcpy-2/3.c (PR 91614)
https://gcc.gnu.org/ml/gcc-patches/2019-09/msg00157.html

/* { dg-final { scan-assembler-times "ldrd\(?!\[^\\n\]*sp\)" 0 } } */

it makes the test work for all possible combinations with
RUNTESTFLAGS="--target_board=unix\{-mcpu=cortex-a15,-mcpu=cortex-a57,-mcpu=cortex-a9,-mcpu=cortex-a8,-mcpu=cortex-a7\}\{-fno-omit-frame-pointer,\}"


Cool isn't it?

Bernd.

Reply via email to