On 9/6/19 12:47 PM, Richard Earnshaw (lists) wrote: > On 06/09/2019 11:28, Bernd Edlinger wrote: >> Hi! >> >> It was pointed out in the PR that the test case fails on big endian >> targets. >> >> This is probably obvious, since the test case scans the assembler >> output for vld1.32, vst1.32, vldr and vstr, but these are never >> generated for -mbig-endian. >> >> So added dg-require-effective-target arm_little_endian. >> >> >> Is it OK for trunk? >> >> >> Thanks >> Bernd. >> > > I think this test needs further work - there's opportunity for the compiler > to optimize it by placement of both a and x, since it can see both > definitions. Note that vldr is safe on 32-bit (or higher) aligned objects, > so if GCC can see the definition it can use that information (the actual > rather than the declared placement) in its optimizations. > > Changing the global to extern, or making it a pointer and dereferencing that > is probably enough. >
done. > Finally, the test is really just a memory copy operation. There's no real > reason why it has to use the vector registers for this at all. I think if > you want to force them, you'll need to do some operations on the values as > well so that the compiler has to place the values in the vector registers > rather than anywhere it sees fit. > Okay, nailed the test case down, to use neon registers. Is it OK now? Thanks Bernd.
2019-09-06 Bernd Edlinger <bernd.edlin...@hotmail.de> * gcc.target/arm/pr91603.c: Add dg-require-effective-target arm_little_endian. Force value in neon register. Index: gcc/testsuite/gcc.target/arm/pr91603.c =================================================================== --- gcc/testsuite/gcc.target/arm/pr91603.c (revision 275409) +++ gcc/testsuite/gcc.target/arm/pr91603.c (working copy) @@ -1,5 +1,6 @@ /* { dg-do compile } */ /* { dg-require-effective-target arm_neon_ok } */ +/* { dg-require-effective-target arm_little_endian } */ /* { dg-options "-O3" } */ /* { dg-add-options arm_neon } */ @@ -6,7 +7,7 @@ typedef __simd64_int32_t int32x2_t; typedef __attribute__((aligned (1))) int32x2_t unalignedvec; -unalignedvec a = {11, 13}; +extern unalignedvec a; void foo(unalignedvec *); @@ -13,7 +14,9 @@ void foo(unalignedvec *); void test() { unalignedvec x = a; + __asm__ ("@ value in %h0": "=w"(x) : "0"(x)); foo (&x); + __asm__ ("@ value in %h0": "=w"(x) : "0"(x)); a = x; }