On Tue, Jun 10, 2025 at 07:43:20PM +0100, Richard Sandiford wrote:
> Spencer Abson <spencer.ab...@arm.com> writes:
> > On Mon, Jun 09, 2025 at 02:48:58PM +0100, Richard Sandiford wrote:
> >> Spencer Abson <spencer.ab...@arm.com> writes:
> >> > On Thu, Jun 05, 2025 at 09:24:27PM +0100, Richard Sandiford wrote:
> >> >> Spencer Abson <spencer.ab...@arm.com> writes:
> >> >> > diff --git
> >> >> > a/gcc/testsuite/gcc.target/aarch64/sve/unpacked_cond_cvtf_1.c
> >> >> > b/gcc/testsuite/gcc.target/aarch64/sve/unpacked_cond_cvtf_1.c
> >> >> > new file mode 100644
> >> >> > index 00000000000..8f69232f2cf
> >> >> > --- /dev/null
> >> >> > +++ b/gcc/testsuite/gcc.target/aarch64/sve/unpacked_cond_cvtf_1.c
> >> >> > @@ -0,0 +1,47 @@
> >> >> > +/* { dg-do compile } */
> >> >> > +/* { dg-options "-O2 -ftree-vectorize -msve-vector-bits=2048
> >> >> > -fno-trapping-math" } */
> >> >>
> >> >> The =2048 is ok, but do you need it for these autovectorisation tests?
> >> >> If vectorisation is treated as not profitable without it, then perhaps
> >> >> we could switch to Tamar's -mmax-vectorization, once that's in.
> >> >
> >> > This isn't needed to make vectorization profitable, but rather to
> >> > make partial vector modes the reliably obvious choice - and hopefully
> >> > one that is isn't affected by future cost model changes. With =2048
> >> > and COUNT, each loop should be fully-unrolled into a single unpacked
> >> > operation (plus setup and return).
> >> >
> >> > For me, this was much more flexible than using builtin vector types,
> >> > and easier to reason about. Maybe that's just me though! I can try
> >> > something else if it would be preferred.
> >>
> >> I don't really agree about the "easier to reason about" bit: IMO,
> >> builtin vector types are the most direct and obvious way of testing
> >> things with fixed-length vectors, for the cases that they can handle
> >> directly. But I agree that vectorisation is more flexible, in that
> >> it can deal with cases that fixed-length builtin vectors can't yet
> >> handle directly.
> >>
> >> My main concern was that the tests didn't seem to have much coverage
> >> of normal VLA codegen. If the aim is predictable costing, it might
> >> be enough to use -moverride=sve_width=2048 instead of
> >> -msve-vector-bits=2048.
> >
> > I see - yeah, -moverride=sve_width=2048 is enough.
> >
> > How about we use builtin vectors wherever possible, and fall back
> > to the current approach (but replacing -msve-vector-bits with
> > -moverride=sve_width) everywhere else?
> >
> > Alternatively, if we'd like to focus on VLA codegen, I could
> > just replace -msve-vector-bits with -moverride=sve_width throughout
> > the series.
>
> I don't think there's any need to go back and change the way the tests
> are written. Just replacing -msve-vector-bits with -moverride=sve_width
> for the vectoriser-based tests sounds good.
>
Hi,

> I see - yeah, -moverride=sve_width=2048 is enough.

This was a bit of an oversight on my part, sorry. Testing these changes
in the order they are written using VLA codegen is quite difficult in
practice. Testing each element size/container size pair requires
coercing the vectorizer into choosing a specific VF; that choice can
change even as the series itself evolves, and it often requires more
tuning/tweaking than just -moverride=sve_width.

I'm a little worried about pushing potentially flaky tests. If I can't
easily follow the reasoning of the cost-model/vectorizer, would you mind
if I stick to the original fixed-length format?

If the choice seems obvious enough, I think I ought to make sure that
nothing silently fails by checking for each extending load pattern, e.g.

/* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+\.d,} n } } */

Thanks,
Spencer
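
As a rough illustration of the last point, a vectoriser-based test with
such a check might look like the sketch below. The function name
cvtf_s32_f64, the COUNT value and the choice of an int32_t to double
conversion are assumptions made here for illustration and are not taken
from the patch. Whether the vectorizer really picks the ld1w ... .d form
for this loop is unverified, and is exactly the kind of thing that would
need confirming per test. The sketch uses plain scan-assembler rather
than scan-assembler-times so that no instruction count has to be guessed.

```c
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-vectorize -moverride=sve_width=2048 -fno-trapping-math" } */

#include <stdint.h>

/* Hypothetical value: 32 x 64-bit containers would fill one 2048-bit
   vector.  */
#define COUNT 32

/* Hypothetical test function: convert 32-bit integers to doubles, so
   that the vectorizer may choose to load each int32_t into a 64-bit
   container.  */
void
cvtf_s32_f64 (double *restrict out, const int32_t *restrict in)
{
  for (int i = 0; i < COUNT; i++)
    out[i] = (double) in[i];
}

/* Intended to fail loudly if the extending load is not chosen.  That
   it is chosen for this loop is an assumption, not a verified result.  */
/* { dg-final { scan-assembler {\tld1w\tz[0-9]+\.d,} } } */
```

If the cost model picked a different VF, the scan would fail rather than
the test silently exercising a different pattern.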