On 03/09/2019 21:45, Bernd Edlinger wrote:
Hi,

due to the introduction of unaligned_loaddi and unaligned_storedi,
two test cases show some regression as PR 91614 points out.

I would like to change the test expectations if these two test cases,
since they seem to be bogus.

That is the test case already failed for arm_prefer_ldrd_strd targets,
since not a ldrd or strd is created but a ldm/stm instead, which
is probably not what is intended.

That is for arm_prefer_ldrd_strd the previously used instruction movdi:

(insn 11 10 12 2 (set (mem/c:DI (reg:SI 112) [0 MEM <charD.4[1:15]> [(voidD.71 
*)&destD.5932]+0 S8 A32])
         (reg:DI 114)) "unaligned-memcpy-2.c":11:3 642 {*movdi_vfp}
      (nil))

is split up very early in subreg1 into:

(insn 21 10 22 2 (set (mem/c:SI (reg:SI 112) [0 MEM <charD.4[1:15]> [(voidD.71 
*)&destD.5932]+0 S4 A32])
         (reg:SI 119)) "unaligned-memcpy-2.c":11:3 640 {*arm_movsi_vfp}
      (nil))
(insn 22 21 12 2 (set (mem/c:SI (plus:SI (reg:SI 112)
                 (const_int 4 [0x4])) [0 MEM <charD.4[1:15]> [(voidD.71 
*)&destD.5932]+4 S4 A32])
         (reg:SI 120 [+4 ])) "unaligned-memcpy-2.c":11:3 640 {*arm_movsi_vfp}
      (nil))

which is finally replaced by:

(insn 35 10 12 2 (parallel [
             (set (mem/c:SI (reg/f:SI 3 r3 [111]) [0 MEM <charD.4[1:15]> [(voidD.71 
*)&destD.5932]+0 S4 A32])
                 (reg:SI 1 r1))
             (set (mem/c:SI (plus:SI (reg/f:SI 3 r3 [111])
                         (const_int 4 [0x4])) [0 MEM <charD.4[1:15]> [(voidD.71 
*)&destD.5932]+4 S4 A32])
                 (reg:SI 2 r2))
         ]) "unaligned-memcpy-2.c":11:3 378 {*stm2_}
      (expr_list:REG_DEAD (reg:SI 2 r2)
         (expr_list:REG_DEAD (reg:SI 1 r1)
             (nil))))

... so stm instead of strd.

Since the unalinged_storedi is an unspec, it is expanded as strd which is
kind of okay, but there is register pair spilled which was not there previously:

aligned_dest:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        strd    r4, [sp, #-8]!
        ldr     r4, [r0]        @ unaligned
        movw    r3, #:lower16:.LANCHOR0
        ldr     r5, [r0, #4]    @ unaligned
        movt    r3, #:upper16:.LANCHOR0
        strd    r4, [r3]
        ldr     r2, [r0, #8]    @ unaligned
        str     r2, [r3, #8]
        ldrh    r2, [r0, #12]   @ unaligned
        strh    r2, [r3, #12]   @ movhi
        ldrb    r2, [r0, #14]   @ zero_extendqisi2
        strb    r2, [r3, #14]
        ldrd    r4, [sp]
        add     sp, sp, #8
        bx      lr

The patch filters out ldrd/strd that sp-relative, and adds a few
missing scan-assembler rules, in order to make the test results more stable.

Alternative could be to use two movsi instead of the unaligned_load/storedi,
but that would end up in ldm/stm which looks plain wrong to me.


Tested using various arm-linux-gnueabihf cross-compilers.
Is it OK for trunk?


Thanks
Bernd.

2019-09-03  Bernd Edlinger  <bernd.edlin...@hotmail.de>

        PR target/91614
        * gcc.target/arm/unaligned-memcpy-2.c: Adjust test expectations.
        * gcc.target/arm/unaligned-memcpy-3.c: Likewise.

Index: gcc/testsuite/gcc.target/arm/unaligned-memcpy-2.c
===================================================================
--- gcc/testsuite/gcc.target/arm/unaligned-memcpy-2.c   (Revision 275341)
+++ gcc/testsuite/gcc.target/arm/unaligned-memcpy-2.c   (Arbeitskopie)
@@ -15,9 +15,11 @@
    loads/stores for the remainder.  */

 /* { dg-final { scan-assembler-times "ldmia" 0 } } */
-/* { dg-final { scan-assembler-times "ldrd" 0 } } */
+/* { dg-final { scan-assembler-times "ldrd\(?!\[^\\n\]*sp\)" 0 } } */
+/* { dg-final { scan-assembler-times "ldm\(?!ia\)" 0 } } */
/* { dg-final { scan-assembler-times "stmia" 1 { target { ! { arm_prefer_ldrd_strd } } } } } */ -/* { dg-final { scan-assembler-times "strd" 1 { target { arm_prefer_ldrd_strd } } } } */ +/* { dg-final { scan-assembler-times "strd\(?!\[^\\n\]*sp\)" 1 { target { arm_prefer_ldrd_strd } } } } */
+/* { dg-final { scan-assembler-times "stm\(?!ia\)" 0 } } */
 /* { dg-final { scan-assembler-times "ldrh" 1 } } */
 /* { dg-final { scan-assembler-times "strh" 1 } } */
 /* { dg-final { scan-assembler-times "ldrb" 1 } } */
Index: gcc/testsuite/gcc.target/arm/unaligned-memcpy-3.c
===================================================================
--- gcc/testsuite/gcc.target/arm/unaligned-memcpy-3.c   (Revision 275341)
+++ gcc/testsuite/gcc.target/arm/unaligned-memcpy-3.c   (Arbeitskopie)
@@ -15,10 +15,12 @@
    loads/stores for the remainder.  */

/* { dg-final { scan-assembler-times "ldmia" 1 { target { ! { arm_prefer_ldrd_strd } } } } } */ -/* { dg-final { scan-assembler-times "ldrd" 1 { target { arm_prefer_ldrd_strd } } } } */
-/* { dg-final { scan-assembler-times "strd" 0 } } */
-/* { dg-final { scan-assembler-times "stm" 0 } } */
-/* { dg-final { scan-assembler-times "ldrh" 1 { target { ! { arm_prefer_ldrd_strd } } } } } */ +/* { dg-final { scan-assembler-times "ldrd\(?!\[^\\n\]*sp\)" 1 { target { arm_prefer_ldrd_strd } } } } */
+/* { dg-final { scan-assembler-times "ldm\(?!ia\)" 0 } } */
+/* { dg-final { scan-assembler-times "stmia" 0 } } */
+/* { dg-final { scan-assembler-times "strd\(?!\[^\\n\]*sp\)" 0 } } */
+/* { dg-final { scan-assembler-times "stm\(?!ia\)" 0 } } */
+/* { dg-final { scan-assembler-times "ldrh" 1 } } */
 /* { dg-final { scan-assembler-times "strh" 1 } } */
-/* { dg-final { scan-assembler-times "ldrb" 1 { target { ! { arm_prefer_ldrd_strd } } } } } */
+/* { dg-final { scan-assembler-times "ldrb" 1 } } */
 /* { dg-final { scan-assembler-times "strb" 1 } } */

Ug!  That's horrible!

Also, it's testing things we really shouldn't be concerned about. For example, the ldrh+ldrb+strh+strb is probably better done as ldr [offset-1] + str [offset-1] to handle the tail (ie re-copying the byte 4th from the end). When someone does that, this test will then needlessly fail.

I'm not really convinced that this sort of thing can be properly tested with scan-assembler tests, there's just too much that might change. Perhaps we should just accept this and create a simple executable test that will fault if run on a target that doesn't support the code we generate...

R.

Reply via email to