Prathamesh Kulkarni <prathamesh.kulka...@linaro.org> writes:
> On Thu, 6 Jun 2019 at 16:54, Richard Sandiford
> <richard.sandif...@arm.com> wrote:
>>
>> Szabolcs Nagy <szabolcs.n...@arm.com> writes:
>> > On 03/06/2019 08:26, Prathamesh Kulkarni wrote:
>> >> +++ b/gcc/testsuite/gcc.target/aarch64/sve/init_8.c
>> >> @@ -0,0 +1,32 @@
>> >> +/* { dg-do assemble { target aarch64_asm_sve_ok } } */
>> >> +/* { dg-options "-O2 -fno-schedule-insns -msve-vector-bits=256
>> >> --save-temps" } */
>> >> +
>> >> +/* Case 5.2: Interleaved elements and constants. */
>> >> +
>> >> +#include <stdint.h>
>> >> +
>> >> +typedef int32_t vnx4si __attribute__((vector_size (32)));
>> >> +
>> >> +__attribute__((noipa))
>> >> +vnx4si foo(int a, int b, int c, int d)
>> >> +{
>> >> +  return (vnx4si) { a, 1, b, 2, c, 3, d, 4 };
>> >> +}
>> >> +
>> >> +/*
>> >> +foo:
>> >> +.LFB0:
>> >> +        .cfi_startproc
>> >> +        ptrue   p0.s, vl8
>> >> +        mov     z0.s, w3
>> >> +        adrp    x3, .LANCHOR0
>> >> +        insr    z0.s, w2
>> >> +        add     x3, x3, :lo12:.LANCHOR0
>> >> +        insr    z0.s, w1
>> >> +        ld1w    z1.s, p0/z, [x3]
>> >> +        insr    z0.s, w0
>> >> +        zip1    z0.s, z0.s, z1.s
>> >> +        ret
>> >> +*/
>> >> +
>> >> +/* { dg-final { scan-assembler {\tmov\t(z[0-9]+\.s),
>> >> w3\n\tadrp\t(x[0-9]+), \.LANCHOR0\n\tinsr\t\1, w2\n\tadd\t\2, \2,
>> >> :lo12:\.LANCHOR0\n\tinsr\t\1, w1\n\tld1w\t(z[0-9]+\.s), p[0-9]+/z,
>> >> \[\2\]\n\tinsr\t\1, w0\n\tzip1\t\1, \1, \3} } } */
>> >
>> > this fails with tiny model when i'm testing aarch64-none-elf
>> >
>> > $ make check-c
>> > 'RUNTESTFLAGS=--target_board=aarch64-elf-qemu{-mcmodel=tiny}
>> > aarch64-sve.exp=init_8.c'
>> > ...
>> > FAIL: gcc.target/aarch64/sve/init_8.c -march=armv8.2-a+sve scan-assembler
>> > \\tmov\\t(z[0-9]+\\.s), w3\\n\\tadrp\\t(x[0-9]+),
>> > \\.LANCHOR0\\n\\tinsr\\t\\1, w2\\n\\tadd\\t\\2, \\2,
>> > :lo12:\\.LANCHOR0\\n\\tinsr\\t\\1, w1\\n\\tld1w\\t(z[0-9]+\\.s), p[0-9]+/z,
>> > \\[\\2\\]\\n\\tinsr\\t\\1, w0\\n\\tzip1\\t\\1, \\1, \\3
>> >
>> > i think you need conditional scan asm for { target aarch64_small }
>> > and { target aarch64_tiny } or just skip the test for tiny,
>>
>> Maybe we should remove the address calculation and replace the ld1w
>> address with \[[^]]*\]. All that really matters for this test is that
>> the vector is loaded from memory.
>>
>> > but even then matching exact register name and instruction scheduling
>> > seems fragile.
>>
>> The only hard-coded register names are the parameters, which are
>> guaranteed by the ABI. Testing for those should be fine.
>>
>> The dg-options pass -fno-schedule-insns, but I guess they should
>> also pass -fno-schedule-insns2. Or maybe just use -O instead.
>> We can always revisit this later if even that isn't enough to make
>> the order stable.
> Thanks for the suggestions. Passing -fno-schedule-insns2 does seem to
> make the order stable.
> For init_1.c to init_4.c there were no intervening instructions, and
> for remaining tests, the patch passes -fno-schedule-insns2
> and adjusts dg-scan accordingly. I verified the tests pass with -mcmodel=tiny.

I think we should use consistent options for all the tests though.
So either we should add -fno-schedule-insns2 to all of them, or
we should switch to -O. TBH -O seems easier :-) (I checked that all
tests do still pass with -O.)

> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/init_10.c
> b/gcc/testsuite/gcc.target/aarch64/sve/init_10.c
> index 9d6e2dfc876..08437e5d8f1 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/init_10.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/init_10.c
> @@ -1,5 +1,5 @@
>  /* { dg-do assemble { target aarch64_asm_sve_ok } } */
> -/* { dg-options "-O2 -fno-schedule-insns -msve-vector-bits=256 --save-temps"
> } */
> +/* { dg-options "-O2 -fno-schedule-insns -fno-schedule-insns2
> -msve-vector-bits=256 --save-temps" } */
>
>  /* Case 5.4: Interleaved repeating elements and non-repeating elements. */
>
> @@ -17,13 +17,14 @@ vnx4si foo(int a, int b, int c, int f)
>  foo:
>  .LFB0:
>          .cfi_startproc
> -        mov     z0.s, w2
>          mov     z1.s, w3
> +        mov     z0.s, w2
>          insr    z0.s, w1
> -        ptrue   p0.s, vl8
>          insr    z0.s, w0
>          zip1    z0.s, z0.s, z1.s
> +        ptrue   p0.s, vl8
> +        st1w    z0.s, p0, [x8]
>          ret
>  */
>
> -/* { dg-final { scan-assembler {\tmov\t(z[0-9]+\.s),
> w3\n\tmov\t(z[0-9]+\.s), w2\n.*\n\tinsr\t\2, w1\n\tinsr\t\2, w0\n\tzip1\t\2,
> \2, \1} } } */
> +/* { dg-final { scan-assembler {\tmov\t(z[0-9]+\.s),
> w3\n\tmov\t(z[0-9]+\.s), w2\n\tinsr\t\2, w1\n\tinsr\t\2, w0\n\tzip1\t\2, \2,
> \1} } } */

You're reintroducing the st1w as part of the asms. We should either
do that for all the tests or leave it out.

Thanks,
Richard
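
For reference, the effect of the relaxed ld1w address pattern quoted earlier in the thread (\[[^]]*\]) can be checked outside the testsuite. The sketch below is only an illustration in Python, not part of the patch. STRICT is a simplified fragment of the init_8.c pattern with the interleaved insr lines left out, and the tiny-model sequence (a single adr) is an assumed example rather than captured compiler output.

```python
import re

# Simplified fragment of the original check: it insists on the
# small-model adrp/add pair before the load (insr lines omitted).
STRICT = re.compile(
    r"\tadrp\t(x[0-9]+), \.LANCHOR0\n"
    r"\tadd\t\1, \1, :lo12:\.LANCHOR0\n"
    r"\tld1w\t(z[0-9]+\.s), p[0-9]+/z, \[\1\]\n"
)

# Relaxed check: only require that the vector is loaded from memory,
# accepting any address operand between the square brackets.
RELAXED = re.compile(r"\tld1w\t(z[0-9]+\.s), p[0-9]+/z, \[[^]]*\]\n")

# Small code model: address formed with adrp + add.
SMALL_ASM = (
    "\tadrp\tx3, .LANCHOR0\n"
    "\tadd\tx3, x3, :lo12:.LANCHOR0\n"
    "\tld1w\tz1.s, p0/z, [x3]\n"
)

# Assumed tiny-model sequence (hypothetical, for illustration only).
TINY_ASM = (
    "\tadr\tx3, .LANCHOR0\n"
    "\tld1w\tz1.s, p0/z, [x3]\n"
)


def matches(pattern: re.Pattern, asm: str) -> bool:
    """Return True if the pattern is found anywhere in the assembly text."""
    return pattern.search(asm) is not None


if __name__ == "__main__":
    for name, asm in (("small", SMALL_ASM), ("tiny", TINY_ASM)):
        print(name, "strict:", matches(STRICT, asm),
              "relaxed:", matches(RELAXED, asm))
```

The relaxed form drops the dependence on how the address is computed, so one scan-assembler line can serve both code models without per-target conditionals.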